[[26]{.chapter-number}  [Stochastic Cellular Automata]{.chapter-title}]{#sec-lulc .quarto-section-identifier}

26 Stochastic Cellular Automata

Code

import tobler
import geopandas as gpd
import pandas as pd
import matplotlib.pyplot as plt
import numpy as np
from geosnap import DataStore
from geosnap.analyze.dynamics import transition, predict_markov_labels, draw_sequence_from_gdf
from geosnap.io import get_acs
from geosnap.visualize import animate_timeseries, plot_transition_matrix
from IPython.display import Image
from libpysal.weights import Queen, Rook, KNN
from matplotlib.colors import to_hex
from tobler.util import h3fy
from tobler.dasymetric import extract_raster_features

%load_ext watermark
%watermark -a 'eli knaap' -v -d -u -p geopandas,geosnap

OMP: Info #276: omp_set_nested routine deprecated, please use omp_set_max_active_levels instead.

Author: eli knaap

Last updated: 2025-04-08

Python implementation: CPython
Python version       : 3.12.8
IPython version      : 8.31.0

geopandas: 1.0.1
geosnap  : 0.14.1.dev14+g0443e2a.d20250103

Or, Tobler rides again: a nouveau computer movie simulating urban growth in detroit

Simulating land-use change in a Cellular Automata (CA) framework is one of the oldest traditions in urban growth modeling. Traditionally, a land-use model like SLEUTH uses a few different parameters calibrated on historical data to control the probability of a unit of land transitioning into a different kind of use (Brail, 2008; Dietzel & Clarke, 2006; Dietzel & Clarke, 2007; Guan & Clarke, 2010; Jantz et al., 2010).

Here, we’ll start with no data and build a simple CA model with a single transition rule using a spatial Markov framework. That is, the probability of a unit transitioning into different uses is a probabilistic function of its current state, and the states of the units around it. Using data from NLCD, we can observe how often these transitions occurred in the past, including how those transitions are conditioned by different spatial contexts. Then we can use that model to simulate conditions into the future.

26.1 Data

Start by grabbing tract-level data for the Detroit-ish region

Code

datasets = DataStore()
counties = ['26163', '26099', '26125']
tracts = get_acs(datasets, county_fips=counties, level="tract", years=[2018])
tracts = tracts.to_crs(tracts.estimate_utm_crs())
tracts.explore(tooltip=[])

Make this Notebook Trusted to load map: File -> Trust Notebook

Now, create a regular hexagonal grid that covers the same area and convert the land cover data from NLCD from a raster into a vector, collapsing contiguous cells into regions

Code

hexes = h3fy(tracts.to_crs(4326), resolution=9, clip=True)
lulc_01 = extract_raster_features(
    hexes,
    "/Users/knaaptime/Dropbox/projects/geosnap_data/nlcd/landcover/nlcd_landcover_2001.tif",
)
lulc_01["value"] = lulc_01["value"].astype(int).astype(str)

Code

codes = legend = np.array(
    [0, 11, 12, 21, 22, 23, 24, 31, 41, 42, 43, 51, 52, 71, 72, 73, 74, 81, 82, 90, 95]
).astype(str)
name = np.array(
    [
        np.nan,
        "Open Water",
        "Perennial Ice/Snow",
        "Developed, Open Space",
        "Developed, Low Intensity",
        "Developed, Medium Intensity",
        "Developed High Intensity",
        "Barren Land (Rock/Sand/Clay)",
        "Deciduous Forest",
        "Evergreen Forest",
        "Mixed Forest",
        "Dwarf Scrub",
        "Shrub/Scrub",
        "Grassland/Herbaceous",
        "Sedge/Herbaceous",
        "Lichens",
        "Moss",
        "Pasture/Hay",
        "Cultivated Crops",
        "Woody Wetlands",
        "Emergent Herbaceous Wetlands",
    ]
)

Code

lu_mapper = dict(zip(codes, name))

Now aggregate the pixels into the (coarser) hexgrid and include a column of colors to match the conventions in land use mapping

Code

def img_to_hex(hexes, img_path, year):

    hexes = hexes.copy()
    # take the LULC image and convert from raster to vector
    lulc = extract_raster_features(hexes, img_path)
    # convert to a string for easier representation as categorical variable
    lulc["value"] = lulc["value"].astype(int).astype(str)
    # pixel value 255 is NA
    lulc["value"] = lulc["value"].replace({"255": np.nan})
    temp = gpd.sjoin(lulc.to_crs(hexes.crs), hexes)
    temp = (
        temp.groupby("hex_id")["value"]
        .agg(lambda x: pd.Series.mode(x)[0])
        .astype(str)
    )  # cheat and take the min when multiple modes are returned...
    hexes = hexes.join(temp)
    hexes["year"] = year
    hexes["value"] = hexes["value"].replace(lu_mapper).replace({"nan": np.nan})
    hexes = hexes.dropna()

    return hexes

Code

nlcd_colors = pd.read_csv('nlcd_colors.csv',)

Code

# convert rgb to 0-1 range
nlcd_colors.color = nlcd_colors.color.apply(lambda x: np.array(x.split(",")).astype(float) / 255)

# convert rgb to hex (easier to hold a single value in a column instead of a list
nlcd_colors.color = nlcd_colors.color.apply(lambda x: to_hex(x))

Code

nlcd_colors

	code	color	description
0	11	#466b9f	Open Water
1	12	#d1def8	Perennial Snow/Ice
2	21	#dec5c5	Developed, Open Space
3	22	#d99282	Developed, Low Intensity
4	23	#eb0000	Developed, Medium Intensity
5	24	#ab0000	Developed High Intensity
6	31	#b3ac9f	Barren Land (Rock/Sand/Clay)
7	41	#68ab5f	Deciduous Forest
8	42	#1c5f2c	Evergreen Forest
9	43	#b5c58f	Mixed Forest
10	52	#ccb879	Shrub/Scrub
11	71	#dfdfc2	Grassland/Herbaceous
12	81	#dcd939	Pasture/Hay
13	82	#ab6c28	Cultivated Crops
14	90	#b8d9eb	Woody Wetlands
15	95	#6c9fb8	Emergent Herbaceous Wetlands

Code

color_mapper = dict(zip(nlcd_colors.description.values.tolist(), nlcd_colors.color.values.tolist()))

Code

lulc_hex_01 = img_to_hex(
    hexes,
    "https://spatial-ucr.s3.amazonaws.com/nlcd/landcover/nlcd_landcover_2001.tif",
    2001,
)

Code

lulc_hex_01['color'] = lulc_hex_01.value.replace(color_mapper)

Code

nlcd_years = [2001, 2004, 2006, 2008, 2011, 2013, 2016]

lulc_hexes = pd.concat(img_to_hex(
    hexes, f"/Users/knaaptime/Dropbox/projects/geosnap_data/nlcd/landcover/nlcd_landcover_{year}.tif",
    year) for year in nlcd_years)
lulc_hexes = lulc_hexes.reset_index()

lulc_hexes = lulc_hexes[lulc_hexes.value!='Shrub/Scrub']
lulc_hexes = lulc_hexes[lulc_hexes.value!='Open Water']

lulc_hexes['color'] = lulc_hexes.value.replace(color_mapper)

Code

animate_timeseries(
    lulc_hexes,
    filename="lulc_dynamics.gif",
    figsize=(12, 12),
    fps=2,
    dpi=100,
    color_col='color',
)

<Figure size 1152x1152 with 0 Axes>

<Figure size 1152x1152 with 0 Axes>

<Figure size 1152x1152 with 0 Axes>

<Figure size 1152x1152 with 0 Axes>

<Figure size 1152x1152 with 0 Axes>

<Figure size 1152x1152 with 0 Axes>

<Figure size 1152x1152 with 0 Axes>

<Figure size 960x960 with 0 Axes>

These are the empirical transitions encoded in the NLCD data between 2001 and 2016, represented at by a slightly larger hexgrid

Code

Image('lulc_dynamics.gif', embed=True, retina=True)

<IPython.core.display.Image object>

26.2 Modeling Land-Use Change

Using geosnap and PySAL, we can look at these transitions more formally, modeling the change over time as a spatial Markov process. Using this framework, we can ask whether certain transitions are more likely than others given their spatial context. For example, is a piece of undeveloped land more likely to become developed if it’s neighbors have already been developed?

Code

sm = transition(lulc_hexes, unit_index='hex_id', cluster_col='value')

/Users/knaaptime/Dropbox/projects/geosnap/geosnap/analyze/dynamics.py:123: FutureWarning: `use_index` defaults to False but will default to True in future. Set True/False directly to control this behavior and silence this warning
  w = Ws[w_type].from_dataframe(gpd.GeoDataFrame(gdf_wide), **w_options)
/Users/knaaptime/miniforge3/envs/urban_analysis/lib/python3.12/site-packages/libpysal/weights/contiguity.py:61: UserWarning: The weights matrix is not fully connected: 
 There are 6 disconnected components.
 There are 2 islands with ids: 16374, 38660.
  W.__init__(self, neighbors, ids=ids, **kw)

('WARNING: ', 16374, ' is an island (no neighbors)')
('WARNING: ', 38660, ' is an island (no neighbors)')

/Users/knaaptime/miniforge3/envs/urban_analysis/lib/python3.12/site-packages/numpy/core/fromnumeric.py:3504: RuntimeWarning: Mean of empty slice.
  return _methods._mean(a, axis=axis, dtype=dtype,

Code

type(sm)

giddy.markov.Spatial_Markov

Inference from the spatial Markov object shows the transitions are not random, but show different rates depending on the “neighborhood” of each observation

Code

sm.Q_p_value

0.0

Code

sm.LR_p_value

0.0

Plotting all the transition matrices can be daunting, but it provides a detailed look at how transition rates vary by context

Code

plot_transition_matrix(transition_model=sm, figsize=(30,30), dpi=200)

array([<Axes: title={'center': 'Global'}>,
       <Axes: title={'center': 'Modal Neighbor - Barren Land (Rock/Sand/Clay)'}>,
       <Axes: title={'center': 'Modal Neighbor - Cultivated Crops'}>,
       <Axes: title={'center': 'Modal Neighbor - Deciduous Forest'}>,
       <Axes: title={'center': 'Modal Neighbor - Developed High Intensity'}>,
       <Axes: title={'center': 'Modal Neighbor - Developed, Low Intensity'}>,
       <Axes: title={'center': 'Modal Neighbor - Developed, Medium Intensity'}>,
       <Axes: title={'center': 'Modal Neighbor - Developed, Open Space'}>,
       <Axes: title={'center': 'Modal Neighbor - Emergent Herbaceous Wetlands'}>,
       <Axes: title={'center': 'Modal Neighbor - Evergreen Forest'}>,
       <Axes: title={'center': 'Modal Neighbor - Grassland/Herbaceous'}>,
       <Axes: title={'center': 'Modal Neighbor - Mixed Forest'}>,
       <Axes: title={'center': 'Modal Neighbor - Pasture/Hay'}>,
       <Axes: title={'center': 'Modal Neighbor - Woody Wetlands'}>,
       <Axes: >, <Axes: >], dtype=object)

We can see that barren land, for example, is much more likely to become developed when it’s neighbors are low-intensity developed (from top-left, see the second row, second column)

26.3 Simulating Future Land-Use

With these transition probabilities in hand (and the set of land uses in the last time period), we have all we need to simulate potential land uses in the future. First, we assume that the empirical transitions observed in prior time periods will remain static, and that only the prior time period influences the probability of transition between landuse types. Then, we can use the spatial context of each unit to select the proper transition matrix, and draw a new land-use type using the transition probabilities observed in the past.

Repeating this process for each unit in the study area gives a stochastic realization of land-use transitions; repeating this process for a range of years lets us look at a potential evolution of the region into the future. We could then change transition probabilities or manually edit land-uses in certain places (e.g. to simulate processes like urban development, sea-level rise, or a protective land-use policy) and see how those changes propagate through the system into the future.

Code

w = Queen.from_dataframe(lulc_hexes[lulc_hexes.year==2016].dropna().reset_index())

/var/folders/j8/5bgcw6hs7cqcbbz48d6bsftw0000gp/T/ipykernel_82155/4138579200.py:1: FutureWarning: `use_index` defaults to False but will default to True in future. Set True/False directly to control this behavior and silence this warning
  w = Queen.from_dataframe(lulc_hexes[lulc_hexes.year==2016].dropna().reset_index())
/Users/knaaptime/miniforge3/envs/urban_analysis/lib/python3.12/site-packages/libpysal/weights/contiguity.py:347: UserWarning: The weights matrix is not fully connected: 
 There are 5 disconnected components.
 There is 1 island with id: 34415.
  W.__init__(self, neighbors, ids=ids, **kw)

Code

temp_res = future_lulc.value.copy().values

Code

all(temp_res == future_lulc.value.values)

True

Code

future_lulc['temp'] = temp_res

Code

future_lulc[future_lulc.value!= temp_res][future_lulc[future_lulc.value!= temp_res].year==2019]

future_lulc

	index	value	year	geometry	temp
0	284778	Emergent Herbaceous Wetlands	2016	POLYGON ((-83.18994 42.03666, -83.18952 42.034...	Emergent Herbaceous Wetlands
1	284779	Emergent Herbaceous Wetlands	2016	POLYGON ((-83.19079 42.04019, -83.18841 42.039...	Emergent Herbaceous Wetlands
2	284780	Emergent Herbaceous Wetlands	2016	POLYGON ((-83.19553 42.04149, -83.19358 42.042...	Emergent Herbaceous Wetlands
3	284781	Woody Wetlands	2016	POLYGON ((-83.18926 42.04307, -83.18688 42.042...	Woody Wetlands
4	284782	Developed, Low Intensity	2016	POLYGON ((-83.20418 42.04056, -83.20223 42.041...	Developed, Low Intensity
...	...	...	...	...	...
47483	332261	Developed, Open Space	2091	POLYGON ((-82.79925 42.8939, -82.79884 42.8921...	Developed, Open Space
47484	332262	Pasture/Hay	2091	POLYGON ((-82.81532 42.89498, -82.81491 42.893...	Pasture/Hay
47485	332263	Deciduous Forest	2091	POLYGON ((-82.81091 42.89543, -82.81049 42.893...	Deciduous Forest
47486	332264	Developed, Low Intensity	2091	POLYGON ((-82.80608 42.8941, -82.80808 42.8929...	Developed, Low Intensity
47487	332265	Developed, Low Intensity	2091	POLYGON ((-82.80167 42.89456, -82.80367 42.893...	Developed, Low Intensity

1234688 rows × 5 columns

Code

future_lulc['color'] = future_lulc.value.replace(color_mapper)

future_lulc.head()

	index	value	year	geometry	temp	color
0	284778	Emergent Herbaceous Wetlands	2016	POLYGON ((-83.18994 42.03666, -83.18952 42.034...	Emergent Herbaceous Wetlands	#6c9fb8
1	284779	Emergent Herbaceous Wetlands	2016	POLYGON ((-83.19079 42.04019, -83.18841 42.039...	Emergent Herbaceous Wetlands	#6c9fb8
2	284780	Emergent Herbaceous Wetlands	2016	POLYGON ((-83.19553 42.04149, -83.19358 42.042...	Emergent Herbaceous Wetlands	#6c9fb8
3	284781	Woody Wetlands	2016	POLYGON ((-83.18926 42.04307, -83.18688 42.042...	Woody Wetlands	#b8d9eb
4	284782	Developed, Low Intensity	2016	POLYGON ((-83.20418 42.04056, -83.20223 42.041...	Developed, Low Intensity	#d99282

Code

animate_timeseries(
    future_lulc,
    filename="predicted_LULC.gif",
    categorical=False,
    title="Simulated Land-Use",
    figsize=(12, 12),
    color_col='color',
    fps=3,
    dpi=150,
)

/Users/knaaptime/Dropbox/projects/geosnap/geosnap/visualize/mapping.py:374: RuntimeWarning: More than 20 figures have been opened. Figures created through the pyplot interface (`matplotlib.pyplot.figure`) are retained until explicitly closed and may consume too much memory. (To control this warning, see the rcParam `figure.max_open_warning`). Consider using `matplotlib.pyplot.close()`.
  fig, ax = plt.subplots(figsize=figsize)

<Figure size 1152x1152 with 0 Axes>

<Figure size 1152x1152 with 0 Axes>

<Figure size 1152x1152 with 0 Axes>

<Figure size 1152x1152 with 0 Axes>

<Figure size 1152x1152 with 0 Axes>

<Figure size 1152x1152 with 0 Axes>

<Figure size 1152x1152 with 0 Axes>

<Figure size 1152x1152 with 0 Axes>

<Figure size 1152x1152 with 0 Axes>

<Figure size 1152x1152 with 0 Axes>

<Figure size 1152x1152 with 0 Axes>

<Figure size 1152x1152 with 0 Axes>

<Figure size 1152x1152 with 0 Axes>

<Figure size 1152x1152 with 0 Axes>

<Figure size 1152x1152 with 0 Axes>

<Figure size 1152x1152 with 0 Axes>

<Figure size 1152x1152 with 0 Axes>

<Figure size 1152x1152 with 0 Axes>

<Figure size 1152x1152 with 0 Axes>

<Figure size 1152x1152 with 0 Axes>

<Figure size 1152x1152 with 0 Axes>

<Figure size 1152x1152 with 0 Axes>

<Figure size 1152x1152 with 0 Axes>

<Figure size 1152x1152 with 0 Axes>

<Figure size 1152x1152 with 0 Axes>

<Figure size 1152x1152 with 0 Axes>

<Figure size 960x960 with 0 Axes>

Now we can plot the simulated land-use moving into the future. This model is pretty naive, but the results seem plausible, with development growing outward from existing cores. We also get lots of infill (but the model isnt smart enough to know some places can’t be developed, etc). It would be fairly easy to include some additional restrictions that some observations are fixed because of policy or topology

Code

Image("predicted_LULC.gif", embed=True, retina=True)

<IPython.core.display.Image object>

We can also quickly compute the share of land in each type and compare two time periods. Since the output is a regular grid, we just need to count the number of cells in each category without needing to actually look at the area

26.3.0.1 2019

Code

future_lulc[future_lulc.year==2019].value.value_counts(normalize=True)

value
Developed, Low Intensity        0.408061
Developed, Open Space           0.190322
Developed, Medium Intensity     0.155008
Deciduous Forest                0.103710
Woody Wetlands                  0.034472
Pasture/Hay                     0.033482
Mixed Forest                    0.024996
Cultivated Crops                0.021058
Emergent Herbaceous Wetlands    0.009666
Developed High Intensity        0.008528
Grassland/Herbaceous            0.005749
Barren Land (Rock/Sand/Clay)    0.003580
Evergreen Forest                0.001369
Name: proportion, dtype: float64

26.3.0.2 2091

Code

future_lulc[future_lulc.year==2091].value.value_counts(normalize=True)

value
Developed, Low Intensity        0.405366
Developed, Medium Intensity     0.204978
Developed, Open Space           0.191080
Deciduous Forest                0.079241
Pasture/Hay                     0.027165
Woody Wetlands                  0.026407
Mixed Forest                    0.022258
Developed High Intensity        0.015225
Cultivated Crops                0.013435
Emergent Herbaceous Wetlands    0.007118
Grassland/Herbaceous            0.004380
Barren Land (Rock/Sand/Clay)    0.002127
Evergreen Forest                0.001221
Name: proportion, dtype: float64

If we take the future minus the past, we can look at relative change

Code

future_lulc[future_lulc.year==2091].value.value_counts(normalize=True) -  future_lulc[future_lulc.year==2019].value.value_counts(normalize=True)

value
Barren Land (Rock/Sand/Clay)   -0.001453
Cultivated Crops               -0.007623
Deciduous Forest               -0.024469
Developed High Intensity        0.006696
Developed, Low Intensity       -0.002695
Developed, Medium Intensity     0.049971
Developed, Open Space           0.000758
Emergent Herbaceous Wetlands   -0.002548
Evergreen Forest               -0.000147
Grassland/Herbaceous           -0.001369
Mixed Forest                   -0.002738
Pasture/Hay                    -0.006317
Woody Wetlands                 -0.008065
Name: proportion, dtype: float64

This would suggest a continued trend toward urbanization, with particularly large relative losses in deciduous forest. Low-intensity development is decreasing, but medium-intensity seems to be growing strongly. Following this analysis, some natural questions that arise include, what would it mean (in social, economic, environmental, etc.) terms to have a 5% increase in medium-intensity development and a 2.5% reduction in deciduous forest? What are the implications for climate change, anticipated housing growth, traffic congestion, change in tax revenue, and so on? And how might these conditions differ under a different scenario, such as one where we have alternative land regulations or urban growth constraints?

These questions nudge us further in the direction of simulation modeling and scenario planning, which are exceedingly useful for medium and large-scale urban planning exercises (Avin, 2007; Avin & Goodspeed, 2020; Chakraborty et al., 2011; Chakraborty & McMillan, 2015; Knaap et al., 2020).

Avin, U. (2007). Using scenarios to make urban plans. Engaging the Future: Forecasts, Scenarios, Plans, and Projects, 103–134.

Avin, U., & Goodspeed, R. (2020). Using Exploratory Scenarios in Planning Practice: A Spectrum of Approaches. Journal of the American Planning Association, 0(0), 1–14. https://doi.org/10.1080/01944363.2020.1746688

Brail, R. K. (Ed.). (2008). Planning support systems for cities and regions. Lincoln Institute of Land Policy.

Chakraborty, A., Kaza, N., Knaap, G.-J., & Deal, B. (2011). Robust Plans and Contingent Plans: Scenario Planning for an Uncertain World. Journal of the American Planning Association, 77(3), 251–266. https://doi.org/10.1080/01944363.2011.582394

Chakraborty, A., & McMillan, A. (2015). Scenario Planning for Urban Planners: Toward a Practitioner’s Guide. Journal of the American Planning Association, 81(1), 18–29. https://doi.org/10.1080/01944363.2015.1038576

Dietzel, C., & Clarke, K. (2006). The effect of disaggregating land use categories in cellular automata during model calibration and forecasting. Computers, Environment and Urban Systems, 30(1), 78–101. https://doi.org/10.1016/j.compenvurbsys.2005.04.001

Dietzel, C., & Clarke, K. C. (2007). Toward Optimal Calibration of the SLEUTH Land Use Change Model. Transactions in GIS, 11(1), 29–45. https://doi.org/10.1111/j.1467-9671.2007.01031.x

Guan, Q., & Clarke, K. C. (2010). A general-purpose parallel raster processing programming library test application using a geographic cellular automata model. International Journal of Geographical Information Science, 24(5), 695–722. https://doi.org/10.1080/13658810902984228

Jantz, C. A., Goetz, S. J., Donato, D., & Claggett, P. (2010). Designing and implementing a regional urban modeling system using the SLEUTH cellular urban model. Computers, Environment and Urban Systems, 34(1), 1–16. https://doi.org/10.1016/j.compenvurbsys.2009.08.003

Knaap, G., Engelberg, D., Avin, U., Erdogan, S., Ducca, F., Welch, T. F., Finio, N., Moeckel, R., & Shahumyan, H. (2020). Modeling Sustainability Scenarios in the Baltimore–Washington (DC) Region. Journal of the American Planning Association, 0(0), 1–14. https://doi.org/10.1080/01944363.2019.1680311