Code
import geopandas as gpd
import osmnx as ox
import pandana as pdna
from geosnap import DataStore
from geosnap.analyze import isochrones_from_gdf
from geosnap.io import get_acs, get_network_from_gdf
import geopandas as gpd
import osmnx as ox
import pandana as pdna
from geosnap import DataStore
from geosnap.analyze import isochrones_from_gdf
from geosnap.io import get_acs, get_network_from_gdf
%load_ext watermark
%watermark -a 'eli knaap' -v -d -u -p geopandas,geosnap
%load_ext autoreload
%autoreload 2
Author: eli knaap
Last updated: 2024-03-11
Python implementation: CPython
Python version : 3.11.0
IPython version : 8.18.1
geopandas: 0.14.2
geosnap : 0.12.1.dev9+g3a1cb0f6de61.d20240110
Note this notebook requires osmnx
One way of thinking about isochrones is considering them as service areas. That is, given some travel budget (in time, distance, transit fare, etc), the isochrone represents the service area accessible within that budget.
Imagine we were interested in the service areas around social facilities in Syracuse
= DataStore() datasets
= get_acs(datasets, msa_fips="45060", years=[2018], level="tract") tracts
= ox.features.features_from_polygon(
facilities "amenity": "social_facility"}
tracts.unary_union, { )
= facilities[facilities.geometry.type == "Point"] facilities
= tracts.to_crs(tracts.estimate_utm_crs()) tracts
= facilities.to_crs(tracts.crs) facilities
=dict(radius=4)) facilities.explore(style_kwds
get_network_from_gdf?
Signature: get_network_from_gdf( gdf, network_type='walk', twoway=False, add_travel_times=False, default_speeds=None, ) Docstring: Create a pandana.Network object from a geodataframe (via OSMnx graph). Parameters ---------- gdf : geopandas.GeoDataFrame dataframe covering the study area of interest; Note the first step is to take the unary union of this dataframe, which is expensive, so large dataframes may be time-consuming. The network will inherit the CRS from this dataframe network_type : str, {"all_private", "all", "bike", "drive", "drive_service", "walk"} the type of network to collect from OSM (passed to `osmnx.graph_from_polygon`) by default "walk" twoway : bool, optional Whether to treat the pandana.Network as directed or undirected. For a directed network, use `twoway=False` (which is the default). For an undirected network (e.g. a walk network) where travel can flow in both directions, the network is more efficient when twoway=True but forces the impedance to be equal in both directions. This has implications for auto or multimodal networks where impedance is generally different depending on travel direction. add_travel_times : bool, default=False whether to use posted travel times from OSM as the impedance measure (rather than network-distance). Speeds are based on max posted drive speeds, see <https://osmnx.readthedocs.io/en/stable/internals-reference.html#osmnx-speed-module> for more information. default_speeds : dict, optional default speeds passed assumed when no data available on the OSM edge. Defaults to {"residential": 35, "secondary": 50, "tertiary": 60}. Only considered if add_travel_times is True Returns ------- pandana.Network a pandana.Network object with node coordinates stored in the same system as the input geodataframe. If add_travel_times is True, the network impedance is travel time measured in seconds (assuming automobile travel speeds); else the impedance is travel distance measured in meters Raises ------ ImportError requires `osmnx`, raises if module not available File: ~/Dropbox/projects/geosnap/geosnap/io/networkio.py Type: function
= get_network_from_gdf(tracts) walk_net
/Users/knaaptime/Dropbox/projects/geosnap/geosnap/io/networkio.py:71: UserWarning: GeoDataFrame is stored in coordinate system EPSG:32618 so the pandana.Network will also be stored in this system
warn(
Generating contraction hierarchies with 10 threads.
Setting CH node vector of size 86033
Setting CH edge vector of size 226066
Range graph removed 216738 edges of 452132
. 10% . 20% . 30% . 40% . 50% . 60% . 70% . 80% . 90% . 100%
isochrones_from_gdf?
Signature: isochrones_from_gdf( origins, threshold, network, network_crs=None, reindex=True, hull='shapely', ratio=0.2, allow_holes=False, use_edges=True, ) Docstring: Create travel isochrones for several origins simultaneously Parameters ---------- origins : geopandas.GeoDataFrame a geodataframe containing the locations of origin point features threshold: float maximum travel cost to define the isochrone, measured in the same impedance units as edges_df in the pandana.Network object. network : pandana.Network pandana Network instance for calculating the shortest path isochrone for each origin feature network_crs : str, int, pyproj.CRS (optional) the coordinate system used to store x and y coordinates in the passed pandana network. If None, the network is assumed to be stored in the same CRS as the origins geodataframe reindex : bool if True, use the dataframe index as the origin and destination IDs (rather than the node_ids of the pandana.Network). Default is True hull : str, {'libpysal', 'shapely'} Which method to generate container polygons (concave hulls) for destination points. If 'libpysal', use `libpysal.cg.alpha_shape_auto` to create the concave hull, else if 'shapely', use `shapely.concave_hull`. Default is libpysal ratio : float ratio keyword passed to `shapely.concave_hull`. Only used if `hull='shapely'`. Default is 0.3 allow_holes : bool keyword passed to `shapely.concave_hull` governing whether holes are allowed in the resulting polygon. Only used if `hull='shapely'`. Default is False. use_edges: bool If true, use vertices from the Network.edge_df to make the polygon more accurate by adhering to roadways. Requires that the 'geometry' column be available on the Network.edges_df, most commonly by using `geosnap.io.get_network_from_gdf` Returns ------- GeoPandas.DataFrame polygon geometries with the isochrones for each origin point feature File: ~/Dropbox/projects/geosnap/geosnap/analyze/network.py Type: function
To create the service area, we need to bound the set of reachable intersections using some kind of polygon. The resolution of the service area is dependent on the resolution of the network (i.e. since geosnap and pandana do not interpolate along the road network, greater intersection density will result in a more well-defined polygon).
There are different bounding-polygon algorithms to choose from. The default is shapely’s concave_hull
implementation, with the alpha_shape_auto algorithm from libpysal available as an alternative
= isochrones_from_gdf(
alpha =walk_net, threshold=2000, hull="libpysal"
facilities, network )
The alpha shape version of the concave hull is the most resource intensive because it tries to optimize the alpha parameter. This also makes it the slowest.
= alpha.explore()
m =m, color="black", style_kwds=dict(radius=3)) facilities.explore(m
= isochrones_from_gdf(
ch01 =walk_net, threshold=2000, hull="shapely", ratio=0.1
facilities, network )
The concave hull algorithm in shapely does not do automatic optimization, but requires setting a ratio
parameter, with smaller values resulting in tighter bounding polygons. If the ratio parameter is set too low, the bounding polygon can be too tight
= ch01.explore()
m =m, color="black", style_kwds=dict(radius=3)) facilities.explore(m
= isochrones_from_gdf(
ch02 =walk_net, threshold=2000, hull="shapely", ratio=0.2
facilities, network )
= ch02.explore()
m =m, color="black", style_kwds=dict(radius=3)) facilities.explore(m
Be very careful with driving times… There are lots of edges with no speed information (requiring us to make strong assumptions), and even at best, these represent free-flow conditions.
To get an automobile network, change network_type='drive'
and add_travel_times=True
= get_network_from_gdf(tracts, network_type="drive", add_travel_times=True) drive_net
/Users/knaaptime/Dropbox/projects/geosnap/geosnap/io/networkio.py:71: UserWarning: GeoDataFrame is stored in coordinate system EPSG:32618 so the pandana.Network will also be stored in this system
warn(
Generating contraction hierarchies with 10 threads.
Setting CH node vector of size 24608
Setting CH edge vector of size 65928
Range graph removed 65088 edges of 131856
. 10% . 20% . 30% . 40% . 50% . 60% . 70% . 80% . 90% . 100%
When using travel-time based impedance, travel times are measured in seconds
# 10 minute drive-shed
= isochrones_from_gdf(
drive_chrone =drive_net, threshold=600, ratio=0.1, allow_holes=False
facilities, network )
# look at only the first record
= drive_chrone.iloc[[1]].explore()
m 1]].explore(m=m, color="black", style_kwds=dict(radius=6)) facilities.iloc[[
/Users/knaaptime/mambaforge/envs/geosnap/lib/python3.11/site-packages/pyproj/transformer.py:820: DeprecationWarning: Conversion of an array with ndim > 0 to a scalar is deprecated, and will error in future. Ensure you extract a single element from your array before performing this operation. (Deprecated NumPy 1.25.)
return self._transformer._transform_point(
drive_net.edges_df
from | to | travel_time | geometry | |
---|---|---|---|---|
0 | 211905525 | 212550604 | 40.6 | LINESTRING (451007.906 4731874.375, 450991.739... |
1 | 211905525 | 6291988938 | 166.9 | LINESTRING (451007.906 4731874.375, 450989.152... |
2 | 212028019 | 212827209 | 55.3 | LINESTRING (426356.190 4737964.394, 426356.358... |
3 | 212533433 | 212533484 | 4.4 | LINESTRING (429887.965 4753965.714, 429886.570... |
4 | 212533433 | 212533444 | 12.6 | LINESTRING (429887.965 4753965.714, 429814.773... |
... | ... | ... | ... | ... |
65923 | 10917595296 | 213148992 | 19.6 | LINESTRING (404376.996 4832069.741, 404376.929... |
65924 | 10942234832 | 213009542 | 24.0 | LINESTRING (397377.555 4819328.544, 397372.182... |
65925 | 10942234832 | 213045439 | 4.3 | LINESTRING (397377.555 4819328.544, 397381.484... |
65926 | 10942234832 | 213177620 | 34.8 | LINESTRING (397377.555 4819328.544, 397373.513... |
65927 | 11012738718 | 212991426 | 4.3 | LINESTRING (378806.020 4814057.760, 378820.010... |
65928 rows × 4 columns
Remember also this is directed travel. These isochrones represent the area reachable from each social service provider; they do not necesssarily represent the origins who can reach the provider within a 10 minute drive. That is, drive networks are directed (and usually asymmetric).
This notebook only runs with routingpy
, which is incompatible with shapely 2 at the moment
Here, I’ve setup an instance of the valhalla routing server on a cloud server. It’s actually pretty easy–everything is in docker, but you won’t be able to follow along unless you setup your own server and edit the URL below accordingly
import geopandas as gpd
import pandas as pd
from geosnap import DataStore
from geosnap import analyze as gaz
from geosnap import io as gio
from routingpy import Valhalla
from shapely.geometry import Polygon
= DataStore() datasets
--------------------------------------------------------------------------- ModuleNotFoundError Traceback (most recent call last) Cell In[1], line 6 4 from geosnap import analyze as gaz 5 from geosnap import io as gio ----> 6 from routingpy import Valhalla 7 from shapely.geometry import Polygon 10 datasets = DataStore() ModuleNotFoundError: No module named 'routingpy'
Often, you need routing, or travel time matrices, or isochrones for a place, but commercial or cloud-based solutions are not an option. For example if:
All of those cases require more control over the routing than web services can provide, which leaves two strong options for local computation.
= gio.get_acs(datasets, county_fips='06073', years=[2019], level='tract')
sd = sd[sd.n_total_pop >0] sd
sd.plot()
<AxesSubplot:>
# grab 10 observations
= sd.head(10) sdsub
= list(zip(sdsub.geometry.centroid.get_coordinates().x.tolist(), sdsub.geometry.centroid.y)) coords
UserWarning: Geometry is in a geographic CRS. Results from 'centroid' are likely incorrect. Use 'GeoSeries.to_crs()' to re-project geometries to a projected CRS before this operation.
coords = list(zip(sdsub.geometry.centroid.get_coordinates().x.tolist(), sdsub.geometry.centroid.y))
<ipython-input-5-9c471eba5560>:1: UserWarning: Geometry is in a geographic CRS. Results from 'centroid' are likely incorrect. Use 'GeoSeries.to_crs()' to re-project geometries to a projected CRS before this operation.
coords = list(zip(sdsub.geometry.centroid.get_coordinates().x.tolist(), sdsub.geometry.centroid.y))
One is to setup a locally-hosted open-source routing engine and query that server instead of a commercial server (as you would do with a geocoding service). This is nice because they are highly performant, but the drawbacksa are: - can be a little difficult to setup and maintain, or ingest new data - can take a ton of compute/storage/memory resources, so may require your own “server” - can make the work unreproducible in many contexts, because anyone running the code also needs access to the local server. We use tailscale to share compute resources, but that adds a small layer of complexity
# no trailing slash, api_key is not necessary
# you need to be connected to tailscale for this to work
= Valhalla(base_url="http://cogs-research:8002", timeout=6000) client
client.isochrones.ex
Init signature: client.Waypoint(position, **kwargs) Docstring: Constructs a waypoint with additional information or constraints. Refer to Valhalla's documentation for details: https://github.com/valhalla/valhalla/blob/master/docs/api/turn-by-turn/api-reference.md#locations Use ``kwargs`` to spedify options, make sure the value is proper for each option. Example: >>> waypoint = Valhalla.WayPoint(position=[8.15315, 52.53151], type='through', heading=120, heading_tolerance=10, minimum_reachability=10, radius=400) >>> route = Valhalla('http://localhost').directions(locations=[[[8.58232, 51.57234]], waypoint, [7.15315, 53.632415]]) File: ~/mambaforge/envs/geosnap/lib/python3.9/site-packages/routingpy/routers/valhalla.py Type: type Subclasses:
# valhalla can do one origin at a time, so only pass the first
= client.isochrones(coords[0], 'pedestrian', intervals=[1000,1500,2000,2500], interval_type='distance') test_isos
# turn the json response into gdfs
def gdf_from_routingpy(isos):
= [Polygon(i['geometry']['coordinates']) for i in isos.raw['features']]
polys = gpd.GeoDataFrame(geometry=polys, crs=4326)
gdf return gdf
= gdf_from_routingpy(test_isos) gdf
gdf.explore()
import pandana as pdna
= pdna.Network.from_hdf5("/Users/knaaptime/Dropbox/projects/geosnap_data/metro_networks_8k/41740.h5") sdnet
= pd.concat([gaz.isochrones(gpd.GeoDataFrame(sdsub.iloc[0].to_frame().T), threshold=t, network=sdnet) for t in [1000,1500,2000,2500]]) gisos
gisos.explore()
gdf.explore()
The isochrones are pretty consistent, so i think the differences between the two are:
= gdf.head(1).explore() m
'distance']==2500].explore(m=m) gisos[gisos[
0]**2 sd.shape[
391876
client.matrix?
Signature: client.matrix( locations, profile, sources=None, destinations=None, preference=None, options=None, avoid_locations=None, avoid_polygons=None, units=None, id=None, dry_run=None, ) Docstring: Gets travel distance and time for a matrix of origins and destinations. For more information, visit https://github.com/valhalla/valhalla/blob/master/docs/api/matrix/api-reference.md. Use ``kwargs`` for any missing ``matrix`` request options. :param locations: Multiple pairs of lng/lat values. :type locations: list of list :param profile: Specifies the mode of transport to use when calculating matrices. One of ["auto", "bicycle", "multimodal", "pedestrian". :type profile: str :param sources: A list of indices that refer to the list of locations (starting with 0). If not passed, all indices are considered. :type sources: list of int :param destinations: A list of indices that refer to the list of locations (starting with 0). If not passed, all indices are considered. :type destinations: list of int :param str preference: Convenience argument to set the cost metric, one of ['shortest', 'fastest']. Note, that shortest is not guaranteed to be absolute shortest for motor vehicle profiles. It's called ``preference`` to be inline with the already existing parameter in the ORS adapter. :param options: Profiles can have several options that can be adjusted to develop the route path, as well as for estimating time along the path. Only specify the actual options dict, the profile will be filled automatically. For more information, visit: https://github.com/valhalla/valhalla/blob/master/docs/api/turn-by-turn/api-reference.md#costing-options :type options: dict :param avoid_locations: A set of locations to exclude or avoid within a route. Specified as a list of coordinates, similar to coordinates object. :type avoid_locations: list of list :param List[List[List[float]]] avoid_polygons: One or multiple exterior rings of polygons in the form of nested JSON arrays, e.g. [[[lon1, lat1], [lon2,lat2]],[[lon1,lat1],[lon2,lat2]]]. Roads intersecting these rings will be avoided during path finding. If you only need to avoid a few specific roads, it's much more efficient to use avoid_locations. Valhalla will close open rings (i.e. copy the first coordingate to the last position). :param units: Distance units for output. One of ['mi', 'km']. Default km. :type units: str :param id: Name your route request. If id is specified, the naming will be sent through to the response. :type id: str :param dry_run: Print URL and parameters without sending the request. :param dry_run: bool :returns: A matrix from the specified sources and destinations. :rtype: :class:`routingpy.matrix.Matrix` File: ~/mambaforge/envs/geosnap/lib/python3.9/site-packages/routingpy/routers/valhalla.py Type: method
# pairwise matrix for all tract centroids in SD county
= list(zip(sd.geometry.centroid.get_coordinates().x.tolist(), sd.geometry.centroid.y)) coords
UserWarning: Geometry is in a geographic CRS. Results from 'centroid' are likely incorrect. Use 'GeoSeries.to_crs()' to re-project geometries to a projected CRS before this operation.
coords = list(zip(sd.geometry.centroid.get_coordinates().x.tolist(), sd.geometry.centroid.y))
<ipython-input-17-e7ac3013bf3d>:2: UserWarning: Geometry is in a geographic CRS. Results from 'centroid' are likely incorrect. Use 'GeoSeries.to_crs()' to re-project geometries to a projected CRS before this operation.
coords = list(zip(sd.geometry.centroid.get_coordinates().x.tolist(), sd.geometry.centroid.y))
:::