Walker Lake#

datashadergeoviewsbokehxarraydask
Published: October 11, 2018 · Modified: December 4, 2024


The Disappearing Walker Lake#

While the loss of the Aral Sea in Kazakhstan and Lake Urmia in Iran have received a lot of attention over the last few decades, this trend is a global phenomena. Reciently a number of papers have been published including one focusing on the Decline of the world’s saline lakes. Many of these lakes have lost the majority of their volume over the last century, including Walker Lake (Nevada, USA) which has lost 90 percent of its volume over the last 100 years.

The following example is intended to replicate the typical processing required in change detection studies similar to the Decline of the world’s saline lakes.

from pathlib import Path

import geoviews as gv
import holoviews as hv
import numpy as np
import rioxarray
import xarray as xr

import cartopy.crs as ccrs
from colorcet import coolwarm
from holoviews import opts
from holoviews.operation.datashader import rasterize
from IPython.display import display

hv.extension('bokeh')

In this example, we would like to use Dask to demonstrate how image processing can be distributed across workers, either running locally or across a cluster. In the next cell, we instantiate a Dask distributed Client where we request eight, single-threaded workers and declare a memory limit of 8GB per worker. You can experiment with different memory limits (e.g 4GB) and different numbers of workers but note that each worker should only use one thread as Datashader manages its own parallelization using Numba:

# arbitrarily choose a memory limit (8GB) to demonstrate the out of core
# processing infrastructure
from dask.distributed import Client
client = Client(memory_limit=8*1e9, n_workers=8, threads_per_worker=1)
# As Datashader uses parallel Numba for raster rendering, we need to use
# single threaded Dask workers on each CPU to avoid contention.
client

Client

Client-9342b987-b258-11ef-88a6-000d3a313db5

Connection method: Cluster object Cluster type: distributed.LocalCluster
Dashboard: http://127.0.0.1:8787/status

Cluster Info

Landsat Image Data#

To replicate this study, we first have to obtain the data from primary sources. The conventional way to obtain Landsat image data is to download it through USGS’s EarthExplorer or NASA’s Giovanni, but to facilitate the example two images have been downloaded from EarthExployer and cached.

The two images used by the original study are LT05_L1TP_042033_19881022_20161001_01_T1 and LC08_L1TP_042033_20171022_20171107_01_T1 from 1988/10/22 and 2017/10/22 respectively. These images contain Landsat Surface Reflectance Level-2 Science Product images.

Loading into xarray#

In the next cells, we load the Landsat-5 and Landsat-8 files into xarray DataArray objects, reading them locally using rioxarray.

def read_landsat_files(pattern):
    data_dir = Path('data')
    data = {
        int(file.stem[-1]): rioxarray.open_rasterio(file, chunks={"x": 1200, "y": 1200}, masked=True)
        for file in sorted(data_dir.glob(pattern))
    }
    dataset = xr.concat(data.values(), dim="band")
    dataset = dataset.assign_coords({"band": list(data)})
    return dataset
landsat_5_img = read_landsat_files('LT05*')
landsat_5_img
<xarray.DataArray (band: 2, y: 7241, x: 7961)> Size: 461MB
dask.array<concatenate, shape=(2, 7241, 7961), dtype=float32, chunksize=(1, 1200, 1200), chunktype=numpy.ndarray>
Coordinates:
  * x            (x) float64 64kB 2.424e+05 2.424e+05 ... 4.812e+05 4.812e+05
  * y            (y) float64 58kB 4.414e+06 4.414e+06 ... 4.197e+06 4.197e+06
    spatial_ref  int64 8B 0
  * band         (band) int64 16B 4 5
Attributes:
    Band_1:         band 4 surface reflectance
    AREA_OR_POINT:  Area
    scale_factor:   1.0
    add_offset:     0.0
    long_name:      band 4 surface reflectance
landsat_8_img = read_landsat_files('LC08*')
landsat_8_img
<xarray.DataArray (band: 2, y: 7941, x: 7821)> Size: 497MB
dask.array<concatenate, shape=(2, 7941, 7821), dtype=float32, chunksize=(1, 1200, 1200), chunktype=numpy.ndarray>
Coordinates:
  * x            (x) float64 63kB 2.433e+05 2.433e+05 ... 4.779e+05 4.779e+05
  * y            (y) float64 64kB 4.426e+06 4.426e+06 ... 4.188e+06 4.188e+06
    spatial_ref  int64 8B 0
  * band         (band) int64 16B 4 5
Attributes:
    Band_1:         band 4 surface reflectance
    AREA_OR_POINT:  Area
    scale_factor:   1.0
    add_offset:     0.0
    long_name:      band 4 surface reflectance

We create a cartopy coordinate reference system (EPSG:32611) that we will be using later on in this notebook:

assert landsat_5_img.rio.crs == landsat_8_img.rio.crs
print(landsat_5_img.rio.crs)
crs = ccrs.epsg(landsat_5_img.rio.crs.to_epsg())
EPSG:32611

Computing the NDVI (1988)#

Now let us compute the NDVI for the 1988 image.

ndvi5 = (landsat_5_img.sel(band=5) - landsat_5_img.sel(band=4))/(landsat_5_img.sel(band=5) + landsat_5_img.sel(band=4))
client.persist(ndvi5)
<xarray.DataArray (y: 7241, x: 7961)> Size: 231MB
dask.array<truediv, shape=(7241, 7961), dtype=float32, chunksize=(1200, 1200), chunktype=numpy.ndarray>
Coordinates:
  * x            (x) float64 64kB 2.424e+05 2.424e+05 ... 4.812e+05 4.812e+05
  * y            (y) float64 58kB 4.414e+06 4.414e+06 ... 4.197e+06 4.197e+06
    spatial_ref  int64 8B 0

Computing the NDVI (2017)#

Now we can do this for the Landsat 8 files for the 2017 image:

ndvi8 = (landsat_8_img.sel(band=5) - landsat_8_img.sel(band=4))/(landsat_8_img.sel(band=5) + landsat_8_img.sel(band=4))

Resampling to same size#

The two images share the same coordinate system but do not have the exact same dimensions or coordinates. Previous versions of this notebook resampled the images to the same size, and optionally allowed to regrid them, all using Datashader. In this version, we now interpolate the Landsat-8 image to fit onto the coordinates of the Landsat-5 one using xarray, approach that provides a similar result.

ndvi8 = ndvi8.interp_like(ndvi5, method="nearest")
client.persist(ndvi8)
<xarray.DataArray (y: 7241, x: 7961)> Size: 231MB
dask.array<transpose, shape=(7241, 7961), dtype=float32, chunksize=(7241, 7961), chunktype=numpy.ndarray>
Coordinates:
    spatial_ref  int64 8B 0
  * x            (x) float64 64kB 2.424e+05 2.424e+05 ... 4.812e+05 4.812e+05
  * y            (y) float64 58kB 4.414e+06 4.414e+06 ... 4.197e+06 4.197e+06

Viewing change via dropdown#

Using Datashader together with GeoViews, we can now easily build an interactive visualization where we select between the 1988 and 2017 images. The use of datashader allows these images to be dynamically updated according to zoom level (Note: it can take datashader a minute to ‘warm up’ before it becomes fully interactive). For more information on how the dropdown widget was created using HoloMap, please refer to the HoloMap reference.

opts.defaults(
    opts.Curve(width=600, tools=['hover']),
    opts.Image(cmap='viridis', width=450, height=450, tools=['hover'], colorbar=True))
hmap = hv.HoloMap({'1988':gv.Image(ndvi5, crs=crs, vdims=['ndvi'], rtol=10), 
                   '2017':gv.Image(ndvi8, crs=crs, vdims=['ndvi'], rtol=10)}, 
                  kdims=['Year']).redim(x='lon', y='lat') # Mapping 'x' and 'y' from rasterio to 'lon' and 'lat'
%%time
display(rasterize(hmap))