H3 Grid#
Build a H3 grid for the San Diego region#
H3 grid
This dataset contains a hexagonal tesselation following Uber’s H3 system that covers the core of San Diego.
The delineation of the city centre is somewhat arbitrary, was manually drawn using geojson.io, and is stored in sd_city_centre.geojson in this folder.
Source: UberURL
Processing: Generation of the dataset is documented inbuild_nasadem_sd.ipynb
Infrastructure#
To create a container that includes h3 save the following on a file called Dockerfile:
FROM darribas/gds_dev:4.1
USER root
RUN wget -O - https://apt.kitware.com/keys/kitware-archive-latest.asc 2>/dev/null | sudo apt-key add - \
&& apt-add-repository 'deb https://apt.kitware.com/ubuntu/ bionic main' \
&& apt-get update \
&& apt-get install --yes cmake
USER $NB_UID
RUN pip install h3
And build the container by running the following from the same folder where the file is stored:
docker build -t gds_h3 .
When if finishes, you should be able to launch the gds_h3 container and run this notebook from there.
%matplotlib inline
from h3 import h3
import geopandas
from shapely.geometry import Polygon
import contextily as ctx
import cenpy
Region to cover#
sd = geopandas.read_file("sd_city_centre.geojson")
ax = sd.plot(alpha=0.25, color="orange", figsize=(9, 9))
ctx.add_basemap(ax, crs=sd.crs.to_string())
Build H3 geography#
%time hexs = h3.polyfill(sd.geometry[0].__geo_interface__, 8, geo_json_conformant = True)
CPU times: user 8.19 ms, sys: 861 µs, total: 9.05 ms
Wall time: 10 ms
polygonise = lambda hex_id: Polygon(
h3.h3_to_geo_boundary(
hex_id, geo_json=True)
)
%time all_polys = geopandas.GeoSeries(list(map(polygonise, hexs)), \
index=hexs, \
crs="EPSG:4326" \
)
CPU times: user 101 ms, sys: 7.66 ms, total: 109 ms
Wall time: 109 ms
ax = all_polys.plot(alpha=0.5, color="xkcd:pale yellow", figsize=(9, 9))
ctx.add_basemap(ax, crs=all_polys.crs.to_string())
ax.set_title(f"{all_polys.shape[0]} Hexagons");
Clip ocean out#
There are quite a few hexagons that fall within the ocean. Let’s get rid of them. For that, we will pull down the polygon for San Diego county using cenpy:
census = cenpy.Decennial2010()
tracts = census.from_msa("San Diego, CA", level="tract")
/opt/conda/lib/python3.7/site-packages/pyproj/crs/crs.py:55: FutureWarning: '+init=<authority>:<code>' syntax is deprecated. '<authority>:<code>' is the preferred initialization method. When making the change, be mindful of axis order changes: https://pyproj4.github.io/pyproj/stable/gotchas.html#axis-order-changes-in-proj-6
return _prepare_from_string(" ".join(pjargs))
/opt/conda/lib/python3.7/site-packages/pyproj/crs/crs.py:55: FutureWarning: '+init=<authority>:<code>' syntax is deprecated. '<authority>:<code>' is the preferred initialization method. When making the change, be mindful of axis order changes: https://pyproj4.github.io/pyproj/stable/gotchas.html#axis-order-changes-in-proj-6
return _prepare_from_string(" ".join(pjargs))
Now these geometries include also some ocean:
ax = tracts.plot(alpha=0.5, color="xkcd:pale green")
ctx.add_basemap(ax, crs=tracts.crs.to_string())
It turns out that part is a separate polygon that may be removed easily:
tracts.head()
| GEOID | geometry | NAME | state | county | tract | |
|---|---|---|---|---|---|---|
| 0 | 06073018200 | POLYGON ((-13066573.290 3920769.610, -13066530... | Census Tract 182, San Diego County, California | 06 | 073 | 018200 |
| 1 | 06073018601 | POLYGON ((-13067719.770 3922939.420, -13067631... | Census Tract 186.01, San Diego County, California | 06 | 073 | 018601 |
| 2 | 06073018000 | POLYGON ((-13064524.900 3917063.650, -13064338... | Census Tract 180, San Diego County, California | 06 | 073 | 018000 |
| 3 | 06073017900 | POLYGON ((-13064042.110 3917657.290, -13064035... | Census Tract 179, San Diego County, California | 06 | 073 | 017900 |
| 4 | 06073018510 | POLYGON ((-13064367.720 3923018.450, -13064295... | Census Tract 185.10, San Diego County, California | 06 | 073 | 018510 |
tracts_land_one = geopandas.GeoDataFrame(
{"geometry": [tracts.query("GEOID != '06073990100'").unary_union],
"id": ["one"]
}, crs=tracts.crs)
ax = tracts_land_one.to_crs(epsg=4269).plot()
all_polys.to_crs(epsg=4269).plot(color="xkcd:pale yellow", ax=ax)
<matplotlib.axes._subplots.AxesSubplot at 0x7fdbf6eeb090>
Let’s convert all_polys into a GeoDataFrame:
h3_all = geopandas.GeoDataFrame({"geometry": all_polys,
"hex_id": all_polys.index},
crs=all_polys.crs
)
We’re an overlay away from our target:
%%time
h3_land = geopandas.overlay(h3_all,
tracts_land_one.to_crs(h3_all.crs),
how="intersection"
)
CPU times: user 3.43 s, sys: 43 ms, total: 3.47 s
Wall time: 3.46 s
ax = h3_land.plot(alpha=0.5, color="xkcd:pale yellow")
ctx.add_basemap(ax, crs=h3_land.crs.to_string())
Write out to GPKG#
! rm sd_h3_grid.gpkg
h3_land.drop("id", axis=1).to_file("sd_h3_grid.gpkg", driver="GPKG")
Tract Vs H3 geographies#
Get tracts only within the grid:
shp = geopandas.GeoDataFrame(
{"geometry": [h3_land.unary_union],
"id": ["one"]
}, crs = h3_land.crs).to_crs(tracts.crs)
sub = geopandas.overlay(tracts, shp, how="intersection")
Compare:
ax = sub.plot(facecolor="k", edgecolor="xkcd:lilac", figsize=(9, 9))
h3_land.to_crs(sub.crs).plot(facecolor="none", linewidth=0.5, edgecolor="xkcd:light aqua", ax=ax)
ctx.add_basemap(ax, crs=sub.crs.to_string())