import contextily as ctximport geopandas as gpdimport matplotlib.pyplot as pltimport pandana as pdnaimport pandas as pdfrom geosnap import DataStorefrom geosnap import analyze as gazfrom geosnap import io as gio%load_ext watermark%watermark -a 'eli knaap'-iv
OMP: Info #276: omp_set_nested routine deprecated, please use omp_set_max_active_levels instead.
/Users/knaaptime/Dropbox/projects/geosnap/geosnap/io/util.py:275: UserWarning: Unable to find local adjustment year for 2021. Attempting from online data
warn(
/Users/knaaptime/Dropbox/projects/geosnap/geosnap/io/constructors.py:215: UserWarning: Currency columns unavailable at this resolution; not adjusting for inflation
warn(
get the intersection IDs nearest to the centroid of each census block (mapping the job data to the travel network)
save these IDs as a variable on the input dataframe so we can use them as a join index
Code
sd_lodes["node_ids"] = sd_network.get_node_ids( sd_lodes.centroid.geometry.x, sd_lodes.centroid.geometry.y)# get the node ids for tracts (for visualizationsd_acs["node_ids"] = sd_network.get_node_ids( sd_acs.centroid.geometry.x, sd_acs.centroid.geometry.y)
/var/folders/79/cknfb1sx2pv16rztkpg6wzlw0000gn/T/ipykernel_1937/430745399.py:2: UserWarning: Geometry is in a geographic CRS. Results from 'centroid' are likely incorrect. Use 'GeoSeries.to_crs()' to re-project geometries to a projected CRS before this operation.
sd_lodes.centroid.geometry.x, sd_lodes.centroid.geometry.y
/var/folders/79/cknfb1sx2pv16rztkpg6wzlw0000gn/T/ipykernel_1937/430745399.py:7: UserWarning: Geometry is in a geographic CRS. Results from 'centroid' are likely incorrect. Use 'GeoSeries.to_crs()' to re-project geometries to a projected CRS before this operation.
sd_acs.centroid.geometry.x, sd_acs.centroid.geometry.y
/Users/knaaptime/mambaforge/envs/urban_analysis/lib/python3.10/site-packages/geopandas/geodataframe.py:1528: SettingWithCopyWarning:
A value is trying to be set on a copy of a slice from a DataFrame.
Try using .loc[row_indexer,col_indexer] = value instead
See the caveats in the documentation: https://pandas.pydata.org/pandas-docs/stable/user_guide/indexing.html#returning-a-view-versus-a-copy
super().__setitem__(key, value)
Code
subcenters = subcenters.dissolve("center")
Code
subcenters.explore(tiles="CartoDB Positron")
Make this Notebook Trusted to load map: File -> Trust Notebook
18.2.2 A Detailed Approach
Since we’re operating on census tracts, our resulting subcenters are a function of tract size and contiguity, but the data are actually much higher resolution than that. We have computed accessibility at the intersection level, rather than the tract level, so we could instead look for significant \(G_i\ast\) clusters of these intersections instead (then wrap a polygon around them). This is the approach taken in Knaap and Rey (forthcoming)
With the new libpysal Graph, we can also use network distances to create our weights object, rather than relying on contiguity between polygons
/var/folders/79/cknfb1sx2pv16rztkpg6wzlw0000gn/T/ipykernel_1937/1609565934.py:1: UserWarning: CRS mismatch between the CRS of left geometries and the CRS of right geometries.
Use `to_crs()` to reproject one of the input geometries to match the CRS of the other.
Left CRS: EPSG:4326
Right CRS: EPSG:4269
all_intersections = gpd.overlay(
# only the nodes attached to a census blocklodes_nodes = all_intersections[all_intersections.index.isin(sd_lodes.node_ids)]lodes_nodes.plot("job_access", scheme="quantiles", k=10, alpha=0.4)
This is still extremely detailed, but we are not including additional geometries where we have no actual data points (i.e. our resolution is only as fine as the block-level where the jobs data come from)
Now we use the nodes_in_range function to return an adjacency list of nodes within two kilometers and convert that into a libpysal.Graph
/Users/knaaptime/mambaforge/envs/urban_analysis/lib/python3.10/site-packages/geopandas/geodataframe.py:1528: SettingWithCopyWarning:
A value is trying to be set on a copy of a slice from a DataFrame.
Try using .loc[row_indexer,col_indexer] = value instead
See the caveats in the documentation: https://pandas.pydata.org/pandas-docs/stable/user_guide/indexing.html#returning-a-view-versus-a-copy
super().__setitem__(key, value)
/Users/knaaptime/mambaforge/envs/urban_analysis/lib/python3.10/site-packages/geopandas/geodataframe.py:1528: SettingWithCopyWarning:
A value is trying to be set on a copy of a slice from a DataFrame.
Try using .loc[row_indexer,col_indexer] = value instead
See the caveats in the documentation: https://pandas.pydata.org/pandas-docs/stable/user_guide/indexing.html#returning-a-view-versus-a-copy
super().__setitem__(key, value)
Make this Notebook Trusted to load map: File -> Trust Notebook
Moving to the intersection scale results in a much larger employment center footprint, because there are many more observations, the surface is much more distinct, and its easier to pick out spatial outliers. The difficulty is that now our dataset is represented as a collection of points, when we want to identify discrete center polygons.
To do that, we first define a range wherein points are considered to belong to the same center (instead of setting this with a discrete threshold rule, you could alternatively use something like DBscan). Then, for each set of points, we wrap a tightly-fitted polygon that represents the “center”.
for ratio in [.05, .1, .2]: hulls[f'ch {ratio}'] = concave_hull(MultiPoint(center_nodes[center_nodes.center ==1].geometry.tolist()), ratio, allow_holes=False,)
Employment Centers are identified as areas with a high concentration of employment based on SANDAG’s 2016 employment inventory from Urbansim summarized to quarter-mile radius hexagons. Local-maxima were identified as starting points, and regions were grown to include neighboring hexagons meeting a minimum employment density threshold within an approximate 2-mile radius. The resulting boundaries were generalized (taking into account major barrier features such as topography and freeways) and used to select SANDAG Master Geographic Reference Areas (MGRAs) by activity-weighted (population and employment) centroid. Employment centers are ranked by Tier based on total number of jobs:
Tier 1: 50,000 or more jobs
Tier 2: 25,000 to 49,999 jobs
Tier 3: 15,000 to 24,999 jobs
Tier 4: 2,500 to 14,999 jobs
Hexbin geography (where people live map layers) – generated using a quarter-mile radius hex bin grid overlay on the region. Data is summarized by hex bin spatially using the centroid of an overlying geography.
Where people live by Employment Center they Work In –2015 LEHD census block level data; converted to hexbin (see hexbin geography). LEHD data was used as it contains information on both the home and work location. Data represents an employee residence by the employment center they work in. https://lehd.ces.census.gov/data/.
Make this Notebook Trusted to load map: File -> Trust Notebook
Although the extent of the polygons differ, the accessibility method agrees with the SANDAG employment centers for all centers in Tiers 1-3. With the exception of Carlsbad, Oceanside, and Encinitas, most of SANDAG’s tier 4 centers fail to meet either the density or significant threshold (often both)
In the map below, the access-based employment centers are represented using intersecting census block geometries
Code
m = sandag.explore(tiles="CartoDB Positron")gpd.sjoin(sd_lodes, node_subcenters).dissolve("index_right").explore(color="red", m=m)
Make this Notebook Trusted to load map: File -> Trust Notebook
ignoring the G statistic and looking only at nodes with job access in the top quartile
Code
m = sandag.explore(tiles="CartoDB Positron")lodes_nodes[lodes_nodes.job_rank >0.75].explore(m=m, color="red")
Make this Notebook Trusted to load map: File -> Trust Notebook
When employment accessibility, alone, is the selection criteria (as in the map above), using nodes in the top quintile usually finds the remaining Tier 4 centers, except for some exurban places like Ramona and Otay Mesa
If you really want to be generous, just take the top half of the access distribution
Code
m = sandag.explore(tiles="CartoDB Positron")lodes_nodes[lodes_nodes.job_rank >0.5].explore(m=m, color="red")
Make this Notebook Trusted to load map: File -> Trust Notebook
With this simple rule, we cover all the areas defined by SANDAG, albeit with a much higher resolution
Cervero, R., & Wu, K.-L. (1998). Sub-centring and Commuting: Evidence from the San Francisco Bay Area, 1980-90. Urban Studies, 35(7), 1059–1076. https://doi.org/10.1080/0042098984484
Giuliano, G., & A. Small, K. (1999). The determinants of growth of employment subcenters. Journal of Transport Geography, 7(3), 189–201. https://doi.org/10.1016/S0966-6923(98)00043-X
Giuliano, G., Redfearn, C., Agarwal, A., & He, S. (2012). Network Accessibility and Employment Centres. Urban Studies, 49(7), 77–95. https://doi.org/10.1177/0042098011411948
Giuliano, G., Redfearn, C., Agarwal, A., Li, C., & Zhuang, D. (2007). Employment concentrations in Los Angeles, 1980-2000. Environment and Planning A, 39(12), 2935–2957. https://doi.org/10.1068/a393
Gordon, P., & Richardson, H. W. (1996). Employment decentralization in US metropolitan areas: Is Los Angeles an outlier or the norm? Environment and Planning A, 28(10), 1727–1743. https://doi.org/10.1068/a281727
McDonald, J. F., & McMillen, D. P. (2000). Employment Subcenters and Subsequent Real Estate Development in Suburban Chicago. Journal of Urban Economics, 48(1), 135–157. https://doi.org/10.1006/juec.1999.2160