Seattle lidar¶
Seattle Lidar¶
Visualize Lidar Scattered Point Elevation Data¶
This notebook uses Datashader to visualize Lidar elevation data from the Puget Sound Lidar consortium, a source of Lidar data for the Puget Sound region of the state of Washington, U.S.A.
Lidar Elevation Data¶
Example X,Y,Z scattered point elevation data from the unpacked 7zip files (unpacked as .gnd files)
! head ../data/q47122d2101.gnd
X,Y,Z
1291149.60,181033.64,467.95
1291113.29,181032.53,460.24
1291065.38,181035.74,451.41
1291113.16,181037.32,455.51
1291116.68,181037.42,456.20
1291162.42,181038.90,467.81
1291111.90,181038.15,454.89
1291066.62,181036.73,451.41
1291019.10,181035.20,451.64
The Seattle area example below loads 25 .gnd
elevation files like the one above. We'll download, cache and read the data using intake
.
import intake
cat = intake.open_catalog('./catalog.yml')
list(cat)
lidar = cat.seattle_lidar()
df = lidar.to_dask()
df.head()
Geographic Metadata¶
Since the data are geographic, we need to know the coordinate reference system (CRS) of the X and Y. All the geographic metadata is stored in the data source entry in the intake catalog.
from pyproj import Proj, transform
lidar.metadata['crs']
washington_state_plane = Proj(init='epsg:2855') # Washington State Plane North EPSG code
web_mercator = Proj(init='epsg:3857') # Mercator projection EPSG code
FT_2_M = 0.3048
def convert_coords(df):
lon, lat = transform(washington_state_plane, web_mercator, df.X.values * FT_2_M, df.Y.values * FT_2_M)
df['meterswest'], df['metersnorth'] = lon, lat
return df[['meterswest', 'metersnorth', 'Z']]
Try out the convert_coords function on a subset of the data
convert_coords(df.head())
Convert the coordinates¶
Since our real dataset is large and partitioned using dask, we need to think about how to apply the convert_coords function to our data.
import dask.distributed
import dask.delayed
dask.distributed.Client()
Explore the task graph to figure out a performant way to split up the coords conversion. First we'll try with using dask.delayed
.
dask.delayed(convert_coords)(df) \
.visualize()
We can see that even though we thought dask.delayed
would help, in actuality we would be requiring all the processes to be done first and then the conversion would happen on the whole dataset in one go. Another approach would be to use dask.map_partitions
to do the conversion on each piece of the data.
df_merc = df.map_partitions(convert_coords)
df_merc.visualize()