30  A Stochastic Model of Land Use Change

30.0.1 OR – Tobler rides again: a nouveau computer movie simulating urban growth in detroit

Code
%load_ext watermark
%watermark -a 'eli knaap' -v -d -u -p geopandas,geosnap

Simulating land-use change in a cellular automata framework is one of the oldest traditions in urban growth modeling. Traditionally, a land-use model like SLEUTH uses a few different parameters calibrated on historical data to control the probability of a unit of land transitioning into a different kind of use (Brail, 2008; Dietzel & Clarke, 2006; Dietzel & Clarke, 2007; Guan & Clarke, 2010; Jantz et al., 2010).

Here, we’ll start with no data and build a simple CA model with a single transition rule using a spatial Markov framework. That is, the probability of a unit transitioning into different uses is a probabilistic function of its current state, and the states of the units around it. Using data from NLCD, we can observe how often these transitions occurred in the past, including how those transitions are conditioned by different spatial contexts. Then we can use that model to simulate conditions into the future.

30.1 Data

Code
import tobler
import geopandas as gpd
import pandas as pd
import matplotlib.pyplot as plt
import numpy as np
from geosnap import DataStore
from geosnap.io import get_acs

Start by grabbing tract-level data for the Detroit-ish region

Code
datasets = DataStore()
/Users/knaaptime/Dropbox/projects/geosnap/geosnap/_data.py:66: UserWarning: The geosnap data storage class is provided for convenience only. The geosnap developers make no promises regarding data quality, consistency, or availability, nor are they responsible for any use/misuse of the data. The end-user is responsible for any and all analyses or applications created with the package.
  warn(
Code
counties = ['26163', '26099', '26125']
Code
tracts = get_acs(datasets, county_fips=counties, level="tract", years=[2018])
Code
tracts = tracts.to_crs(tracts.estimate_utm_crs())
Code
tracts.explore(tooltip=[])
Make this Notebook Trusted to load map: File -> Trust Notebook
Code
from tobler.util import h3fy

Now, create a regular hexagonal grid that covers the same area

Code
hexes = h3fy(tracts.to_crs(4326), resolution=9, clip=True)

Convert the land cover data from NLCD from a raster into a vector, collapsing contiguous cells into regions

Code
from tobler.dasymetric import extract_raster_features
Code
lulc_01 = extract_raster_features(
    hexes,
    "/Users/knaaptime/Dropbox/projects/geosnap_data/nlcd/landcover/nlcd_landcover_2001.tif",
)
lulc_01["value"] = lulc_01["value"].astype(int).astype(str)
Code
codes = legend = np.array(
    [0, 11, 12, 21, 22, 23, 24, 31, 41, 42, 43, 51, 52, 71, 72, 73, 74, 81, 82, 90, 95]
).astype(str)
name = np.array(
    [
        np.nan,
        "Open Water",
        "Perennial Ice/Snow",
        "Developed, Open Space",
        "Developed, Low Intensity",
        "Developed, Medium Intensity",
        "Developed High Intensity",
        "Barren Land (Rock/Sand/Clay)",
        "Deciduous Forest",
        "Evergreen Forest",
        "Mixed Forest",
        "Dwarf Scrub",
        "Shrub/Scrub",
        "Grassland/Herbaceous",
        "Sedge/Herbaceous",
        "Lichens",
        "Moss",
        "Pasture/Hay",
        "Cultivated Crops",
        "Woody Wetlands",
        "Emergent Herbaceous Wetlands",
    ]
)
Code
lu_mapper = dict(zip(codes, name))

Now aggregate the pixels into the (coarser) hexgrid and include a column of colors to match the conventions in land use mapping

Code
def img_to_hex(hexes, img_path, year):

    hexes = hexes.copy()

    # take the LULC image and convert from raster to vector
    lulc = extract_raster_features(hexes, img_path)

    # convert to a string for easier representation as categorical variable
    lulc["value"] = lulc["value"].astype(int).astype(str)

    # pixel value 255 is NA
    lulc["value"] = lulc["value"].replace({"255": np.nan})

    temp = gpd.sjoin(lulc.to_crs(hexes.crs), hexes)

    temp = (
        temp.groupby("index_right")["value"]
        .agg(lambda x: pd.Series.mode(x)[0])
        .astype(str)
    )  # cheat and take the min when multiple modes are returned...
    hexes = hexes.join(temp)
    hexes["year"] = year
    hexes["value"] = hexes["value"].replace(lu_mapper).replace({"nan": np.nan})
    hexes = hexes.dropna()

    return hexes
Code
from matplotlib.colors import to_hex
Code
nlcd_colors = pd.read_csv('nlcd_colors.csv',)
Code
# convert rgb to 0-1 range
nlcd_colors.color = nlcd_colors.color.apply(lambda x: np.array(x.split(",")).astype(float) / 255)

# convert rgb to hex (easier to hold a single value in a column instead of a list
nlcd_colors.color = nlcd_colors.color.apply(lambda x: to_hex(x))
Code
nlcd_colors
code color description
0 11 #466b9f Open Water
1 12 #d1def8 Perennial Snow/Ice
2 21 #dec5c5 Developed, Open Space
3 22 #d99282 Developed, Low Intensity
4 23 #eb0000 Developed, Medium Intensity
5 24 #ab0000 Developed High Intensity
6 31 #b3ac9f Barren Land (Rock/Sand/Clay)
7 41 #68ab5f Deciduous Forest
8 42 #1c5f2c Evergreen Forest
9 43 #b5c58f Mixed Forest
10 52 #ccb879 Shrub/Scrub
11 71 #dfdfc2 Grassland/Herbaceous
12 81 #dcd939 Pasture/Hay
13 82 #ab6c28 Cultivated Crops
14 90 #b8d9eb Woody Wetlands
15 95 #6c9fb8 Emergent Herbaceous Wetlands
Code
color_mapper = dict(zip(nlcd_colors.description.values.tolist(), nlcd_colors.color.values.tolist()))
Code
lulc_hex_01 = img_to_hex(
    hexes,
    "https://spatial-ucr.s3.amazonaws.com/nlcd/landcover/nlcd_landcover_2001.tif",
    2001,
)
Code
lulc_hex_01['color'] = lulc_hex_01.value.replace(color_mapper)
Code
#lulc_hex_01.explore(color=lulc_hex_01.color.values.tolist(), tooltip=['value'])
Code
nlcd_years = [2001, 2004, 2006, 2008, 2011, 2013, 2016]
Code
lulc_hexes = pd.concat(img_to_hex(
    hexes, f"/Users/knaaptime/Dropbox/projects/geosnap_data/nlcd/landcover/nlcd_landcover_{year}.tif",
    year) for year in nlcd_years)
Code
lulc_hexes = lulc_hexes.reset_index()
Code
lulc_hexes = lulc_hexes[lulc_hexes.value!='Shrub/Scrub']
Code
lulc_hexes = lulc_hexes[lulc_hexes.value!='Open Water']
Code
lulc_hexes['color'] = lulc_hexes.value.replace(color_mapper)
Code
from geosnap.visualize import animate_timeseries
Code
from IPython.display import Image

These are the empirical transitions encoded in the NLCD data between 2001 and 2016, represented at by a slightly larger hexgrid

Code
Image('lulc_dynamics.gif', embed=True, retina=True)
<IPython.core.display.Image object>

Observed Transitions

Observed Transitions

30.2 Modeling Land-Use Change

Code
from geosnap.analyze.dynamics import transition, predict_markov_labels, draw_sequence_from_gdf
Code
from geosnap.visualize import plot_transition_matrix

Using geosnap and PySAL, we can look at these transitions more formally, modeling the change over time as a spatial Markov process. Using this framework, we can ask whether certain transitions are more likely than others given their spatial context. For example, is a piece of undeveloped land more likely to become developed if it’s neighbors have already been developed?

Code
sm = transition(lulc_hexes, unit_index='hex_id', cluster_col='value')
/Users/knaaptime/Dropbox/projects/geosnap/geosnap/analyze/dynamics.py:123: FutureWarning: `use_index` defaults to False but will default to True in future. Set True/False directly to control this behavior and silence this warning
  w = Ws[w_type].from_dataframe(gpd.GeoDataFrame(gdf_wide), **w_options)
/Users/knaaptime/Dropbox/projects/libpysal/libpysal/weights/weights.py:224: UserWarning: The weights matrix is not fully connected: 
 There are 6 disconnected components.
 There are 2 islands with ids: 16374, 38660.
  warnings.warn(message)
('WARNING: ', 16374, ' is an island (no neighbors)')
('WARNING: ', 38660, ' is an island (no neighbors)')
/Users/knaaptime/mambaforge/envs/geosnap/lib/python3.9/site-packages/numpy/core/fromnumeric.py:3464: RuntimeWarning: Mean of empty slice.
  return _methods._mean(a, axis=axis, dtype=dtype,
Code
type(sm)
giddy.markov.Spatial_Markov

Inference from the spatial Markov object shows the transitions are not random, but show different rates depending on the “neighborhood” of each observation

Code
sm.Q_p_value
0.0
Code
sm.LR_p_value
0.0

Plotting all the transition matrices can be daunting, but it provides a detailed look at how transition rates vary by context

Code
plot_transition_matrix(transition_model=sm, figsize=(30,30), dpi=200)
array([<AxesSubplot:title={'center':'Global'}>,
       <AxesSubplot:title={'center':'Modal Neighbor - Barren Land (Rock/Sand/Clay)'}>,
       <AxesSubplot:title={'center':'Modal Neighbor - Cultivated Crops'}>,
       <AxesSubplot:title={'center':'Modal Neighbor - Deciduous Forest'}>,
       <AxesSubplot:title={'center':'Modal Neighbor - Developed High Intensity'}>,
       <AxesSubplot:title={'center':'Modal Neighbor - Developed, Low Intensity'}>,
       <AxesSubplot:title={'center':'Modal Neighbor - Developed, Medium Intensity'}>,
       <AxesSubplot:title={'center':'Modal Neighbor - Developed, Open Space'}>,
       <AxesSubplot:title={'center':'Modal Neighbor - Emergent Herbaceous Wetlands'}>,
       <AxesSubplot:title={'center':'Modal Neighbor - Evergreen Forest'}>,
       <AxesSubplot:title={'center':'Modal Neighbor - Grassland/Herbaceous'}>,
       <AxesSubplot:title={'center':'Modal Neighbor - Mixed Forest'}>,
       <AxesSubplot:title={'center':'Modal Neighbor - Pasture/Hay'}>,
       <AxesSubplot:title={'center':'Modal Neighbor - Woody Wetlands'}>,
       <AxesSubplot:>, <AxesSubplot:>], dtype=object)

We can see that barren land, for example, is much more likely to become developed when it’s neighbors are low-intensity developed (from top-left, see the second row, second column)

30.3 Simulating Future Land-Use

With these transition probabilities in hand (and the set of land uses in the last time period), we have all we need to simulate potential land uses in the future. First, we assume that the empirical transitions observed in prior time periods will remain static, and that only the prior time period influences the probability of transition between landuse types. Then, we can use the spatial context of each unit to select the proper transition matrix, and draw a new land-use type using the transition probabilities observed in the past. Repeating this process for each unit in the study area gives a stochastic realization of land-use transitions; repeating this process for a range of years lets us look at a potential evolution of the region into the future. We could then change transition probabilities or manually edit land-uses in certain places (e.g. to simulate processes like urban development, sea-level rise, or a protective land-use policy) and see how those changes propagate through the system into the future.

Code
from libpysal.weights import Queen, Rook, KNN
Code
w = Queen.from_dataframe(lulc_hexes[lulc_hexes.year==2016].dropna().reset_index())
FutureWarning: `use_index` defaults to False but will default to True in future. Set True/False directly to control this behavior and silence this warning
  w = Queen.from_dataframe(lulc_hexes[lulc_hexes.year==2016].dropna().reset_index())
/Users/knaaptime/Dropbox/projects/libpysal/libpysal/weights/weights.py:224: UserWarning: The weights matrix is not fully connected: 
 There are 5 disconnected components.
 There is 1 island with id: 34415.
  warnings.warn(message)
Code
temp_res = future_lulc.value.copy().values
Code
all(temp_res == future_lulc.value.values)
True
Code
future_lulc['temp'] = temp_res
Code
future_lulc[future_lulc.value!= temp_res][future_lulc[future_lulc.value!= temp_res].year==2019]
value year geometry temp
Code
future_lulc
value year geometry temp
284778 Emergent Herbaceous Wetlands 2016 POLYGON ((-83.18994 42.03666, -83.18952 42.034... Emergent Herbaceous Wetlands
284779 Emergent Herbaceous Wetlands 2016 POLYGON ((-83.18841 42.03954, -83.18799 42.037... Emergent Herbaceous Wetlands
284780 Emergent Herbaceous Wetlands 2016 POLYGON ((-83.19121 42.04195, -83.19079 42.040... Emergent Herbaceous Wetlands
284781 Woody Wetlands 2016 POLYGON ((-83.18688 42.04242, -83.18646 42.040... Woody Wetlands
284782 Developed, Low Intensity 2016 POLYGON ((-83.20418 42.04056, -83.20223 42.041... Developed, Low Intensity
... ... ... ... ...
332261 Developed, Open Space 2091 POLYGON ((-82.79925 42.89390, -82.79884 42.892... Developed, Open Space
332262 Pasture/Hay 2091 POLYGON ((-82.81532 42.89498, -82.81491 42.893... Pasture/Hay
332263 Deciduous Forest 2091 POLYGON ((-82.81091 42.89543, -82.81049 42.893... Deciduous Forest
332264 Developed, Low Intensity 2091 POLYGON ((-82.80608 42.89410, -82.80808 42.892... Developed, Low Intensity
332265 Developed, Low Intensity 2091 POLYGON ((-82.80167 42.89456, -82.80367 42.893... Developed, Low Intensity

1234688 rows × 4 columns

Code
import proplot # change aesthetics
Code
# add color column  based on LU type

future_lulc['color'] = future_lulc.value.replace(color_mapper)
Code
future_lulc.head()
value year geometry temp color
284778 Emergent Herbaceous Wetlands 2016 POLYGON ((-83.18994 42.03666, -83.18952 42.034... Emergent Herbaceous Wetlands #6c9fb8
284779 Emergent Herbaceous Wetlands 2016 POLYGON ((-83.18841 42.03954, -83.18799 42.037... Emergent Herbaceous Wetlands #6c9fb8
284780 Emergent Herbaceous Wetlands 2016 POLYGON ((-83.19121 42.04195, -83.19079 42.040... Emergent Herbaceous Wetlands #6c9fb8
284781 Woody Wetlands 2016 POLYGON ((-83.18688 42.04242, -83.18646 42.040... Woody Wetlands #b8d9eb
284782 Developed, Low Intensity 2016 POLYGON ((-83.20418 42.04056, -83.20223 42.041... Developed, Low Intensity #d99282

Now we can plot the simulated land-use moving into the future. This model is pretty naive, but the results seem plausible, with development growing outward from existing cores. We also get lots of infill (but the model isnt smart enough to know some places can’t be developed, etc). It would be fairly easy to include some additional restrictions that some observations are fixed because of policy or topology

Simulated Land Use

Simulated Land Use
Code
Image("predicted_LULC.gif", embed=True, retina=True)
<IPython.core.display.Image object>

We can also quickly compute the share of land in each type and compare two time periods. Since the output is a regular grid, we just need to count the number of cells in each category without needing to actually look at the area

30.3.0.1 2019

Code
future_lulc[future_lulc.year==2019].value.value_counts(normalize=True) 
Developed, Low Intensity        0.407871
Developed, Open Space           0.190448
Developed, Medium Intensity     0.155113
Deciduous Forest                0.103837
Woody Wetlands                  0.034451
Pasture/Hay                     0.033461
Mixed Forest                    0.024848
Cultivated Crops                0.021058
Emergent Herbaceous Wetlands    0.009645
Developed High Intensity        0.008571
Grassland/Herbaceous            0.005770
Barren Land (Rock/Sand/Clay)    0.003559
Evergreen Forest                0.001369
Name: value, dtype: float64

30.3.0.2 2091

Code
future_lulc[future_lulc.year==2091].value.value_counts(normalize=True) 
Developed, Low Intensity        0.404923
Developed, Medium Intensity     0.204494
Developed, Open Space           0.191480
Deciduous Forest                0.079325
Pasture/Hay                     0.027460
Woody Wetlands                  0.025606
Mixed Forest                    0.022995
Developed High Intensity        0.014930
Cultivated Crops                0.013730
Emergent Herbaceous Wetlands    0.007497
Grassland/Herbaceous            0.004254
Barren Land (Rock/Sand/Clay)    0.002064
Evergreen Forest                0.001242
Name: value, dtype: float64

If we take the future minus the past, we can look at relative change

Code
future_lulc[future_lulc.year==2091].value.value_counts(normalize=True) -  future_lulc[future_lulc.year==2019].value.value_counts(normalize=True) 
Barren Land (Rock/Sand/Clay)   -0.001495
Cultivated Crops               -0.007328
Deciduous Forest               -0.024511
Developed High Intensity        0.006360
Developed, Low Intensity       -0.002948
Developed, Medium Intensity     0.049381
Developed, Open Space           0.001032
Emergent Herbaceous Wetlands   -0.002148
Evergreen Forest               -0.000126
Grassland/Herbaceous           -0.001516
Mixed Forest                   -0.001853
Pasture/Hay                    -0.006002
Woody Wetlands                 -0.008844
Name: value, dtype: float64

This would suggest a continued trend toward urbanization, with particularly large relative losses in deciduous forest. Low-intensity development is decreasing, but medium-intensity seems to be growing strongly. Following this analysis, some natural questions that arise include, what would it mean (in social, economic, environmental, etc.) terms to have a 5% increase in medium-intensity development and a 2.5% reduction in deciduous forest? What are the implications for climate change, anticipated housing growth, traffic congestion, change in tax revenue, and so on? And how might these conditions differ under a different scenario, such as one where we have alternative land regulations or urban growth constraints?

These questions nudge us further in the direction of simulation modeling and scenario planning, which are exceedingly useful for medium and large-scale urban planning exercises (Avin, 2007; Avin & Goodspeed, 2020; Chakraborty et al., 2011; Chakraborty & McMillan, 2015; Knaap et al., 2020).

:::