Spatial block cross-validation

This short demo shows how to compare spatial regression models with bayespecon.diagnostics.spatial_kfold, a refit-based spatial k-fold predictive evaluator following Roberts et al. (2017).

For each fold, the model is refit on the training observations and the held-out fold’s log predictive density is evaluated under the full-data joint Gaussian implied by the model:

\[ \log p(y_{\text{test}} \mid y_{\text{train}}, \theta) = \tfrac{1}{2}\log|\Lambda_{tt}| - \tfrac{n_{\text{test}}}{2}\log(2\pi) - \tfrac{1}{2}\, z_{\text{test}}^{\top}\,\Lambda_{tt}^{-1}\, z_{\text{test}}, \]

with \(\Lambda = A^{\top}A/\sigma^{2}\) where \(A = I - \rho W\) (SAR / SDM), \(A = I - \lambda W\) (SEM / SDEM), or \(A = I\) (OLS / SLX), and \(z = \Lambda(y - \mu)\). This avoids the well-known failure modes of PSIS-LOO on spatially dependent data.

Known-truth setup

We simulate from a SAR data-generating process on a regular grid so the “correct” model is known in advance. A well-behaved CV criterion should prefer the SAR over the misspecified OLS, SLX, and SEM alternatives.

import numpy as np
import pandas as pd
from libpysal.graph import Graph

from bayespecon.dgp import simulate_sar
from bayespecon.diagnostics import spatial_kfold
from bayespecon.models import OLS, SAR, SEM, SLX

SEED = 0
GRID = 12  # 12 x 12 = 144 cells
RHO_TRUE = 0.6
BETA_TRUE = np.array([1.0, 2.0, -1.5])

gdf = simulate_sar(
    n=GRID,
    rho=RHO_TRUE,
    beta=BETA_TRUE,
    sigma=1.0,
    seed=SEED,
    create_gdf=True,
    geometry_type="polygon",
)

W = Graph.build_contiguity(gdf, rook=True).transform("r")
print(f"n = {len(gdf)}, true rho = {RHO_TRUE}, columns = {list(gdf.columns)}")
n = 144, true rho = 0.6, columns = ['y', 'X_0', 'X_1', 'X_2', 'geometry']

Fit candidate models

We fit four candidates — only the SAR matches the DGP — using short MCMC runs sufficient for a pedagogical example.

FIT_KW = dict(draws=400, tune=400, chains=2, random_seed=SEED, progressbar=False)
FORMULA = "y ~ X_1 + X_2"

models = {
    "OLS": OLS(formula=FORMULA, data=gdf, W=W),
    "SLX": SLX(formula=FORMULA, data=gdf, W=W),
    "SAR": SAR(formula=FORMULA, data=gdf, W=W, logdet_method="eigenvalue"),
    "SEM": SEM(formula=FORMULA, data=gdf, W=W, logdet_method="eigenvalue"),
}
for name, m in models.items():
    m.fit(**FIT_KW)
    print(f"  fit {name}")
  fit OLS
  fit SLX
  fit SAR
  fit SEM
Initializing NUTS using jitter+adapt_diag...
Multiprocess sampling (2 chains in 2 jobs)
NUTS: [beta, sigma2]
Sampling 2 chains for 400 tune and 400 draw iterations (800 + 800 draws total) took 5 seconds.
We recommend running at least 4 chains for robust computation of convergence diagnostics
Initializing NUTS using jitter+adapt_diag...
Multiprocess sampling (2 chains in 2 jobs)
NUTS: [beta, sigma2]
Sampling 2 chains for 400 tune and 400 draw iterations (800 + 800 draws total) took 5 seconds.
We recommend running at least 4 chains for robust computation of convergence diagnostics

Spatial block CV

spatial_kfold derives 5 spatial blocks from the cell centroids via KMeans, refits each model on each train fold, and accumulates the held-out elpd. Higher elpd is better.

CV_KW = dict(draws=200, tune=200, chains=1, random_seed=SEED, progressbar=False)

results = {}
for name, m in models.items():
    results[name] = spatial_kfold(m, geometry=gdf.geometry, n_blocks=5, **CV_KW)

summary = pd.DataFrame(
    {
        "elpd": [r.elpd for r in results.values()],
        "se": [r.se for r in results.values()],
        "n_folds": [r.n_folds for r in results.values()],
    },
    index=list(results.keys()),
).sort_values("elpd", ascending=False)
summary["delta_elpd"] = summary["elpd"] - summary["elpd"].max()
summary
/home/runner/micromamba/envs/test/lib/python3.14/site-packages/bayespecon/diagnostics/spatial_cv.py:139: UserWarning: W does not appear to be row-standardised (row sums ≠ 1). Most spatial models assume W is row-standardised; results may be unreliable otherwise. For a scipy sparse matrix normalise rows manually (divide each row by its sum). To use a libpysal.graph.Graph set its transformation attribute: graph = graph.transform('r').
  new = model.__class__(
Initializing NUTS using jitter+adapt_diag...
Sequential sampling (1 chains in 1 job)
NUTS: [beta, sigma2]
Sampling 1 chain for 200 tune and 200 draw iterations (200 + 200 draws total) took 0 seconds.
Only one chain was sampled, this makes it impossible to run some convergence checks
/home/runner/micromamba/envs/test/lib/python3.14/site-packages/bayespecon/diagnostics/spatial_cv.py:139: UserWarning: W does not appear to be row-standardised (row sums ≠ 1). Most spatial models assume W is row-standardised; results may be unreliable otherwise. For a scipy sparse matrix normalise rows manually (divide each row by its sum). To use a libpysal.graph.Graph set its transformation attribute: graph = graph.transform('r').
  new = model.__class__(
Initializing NUTS using jitter+adapt_diag...
Sequential sampling (1 chains in 1 job)
NUTS: [beta, sigma2]
Sampling 1 chain for 200 tune and 200 draw iterations (200 + 200 draws total) took 0 seconds.
Only one chain was sampled, this makes it impossible to run some convergence checks
/home/runner/micromamba/envs/test/lib/python3.14/site-packages/bayespecon/diagnostics/spatial_cv.py:139: UserWarning: W does not appear to be row-standardised (row sums ≠ 1). Most spatial models assume W is row-standardised; results may be unreliable otherwise. For a scipy sparse matrix normalise rows manually (divide each row by its sum). To use a libpysal.graph.Graph set its transformation attribute: graph = graph.transform('r').
  new = model.__class__(
Initializing NUTS using jitter+adapt_diag...
Sequential sampling (1 chains in 1 job)
NUTS: [beta, sigma2]
Sampling 1 chain for 200 tune and 200 draw iterations (200 + 200 draws total) took 0 seconds.
Only one chain was sampled, this makes it impossible to run some convergence checks
/home/runner/micromamba/envs/test/lib/python3.14/site-packages/bayespecon/diagnostics/spatial_cv.py:139: UserWarning: W does not appear to be row-standardised (row sums ≠ 1). Most spatial models assume W is row-standardised; results may be unreliable otherwise. For a scipy sparse matrix normalise rows manually (divide each row by its sum). To use a libpysal.graph.Graph set its transformation attribute: graph = graph.transform('r').
  new = model.__class__(
Initializing NUTS using jitter+adapt_diag...
Sequential sampling (1 chains in 1 job)
NUTS: [beta, sigma2]
Sampling 1 chain for 200 tune and 200 draw iterations (200 + 200 draws total) took 0 seconds.
Only one chain was sampled, this makes it impossible to run some convergence checks
/home/runner/micromamba/envs/test/lib/python3.14/site-packages/bayespecon/diagnostics/spatial_cv.py:139: UserWarning: W does not appear to be row-standardised (row sums ≠ 1). Most spatial models assume W is row-standardised; results may be unreliable otherwise. For a scipy sparse matrix normalise rows manually (divide each row by its sum). To use a libpysal.graph.Graph set its transformation attribute: graph = graph.transform('r').
  new = model.__class__(
Initializing NUTS using jitter+adapt_diag...
Sequential sampling (1 chains in 1 job)
NUTS: [beta, sigma2]
Sampling 1 chain for 200 tune and 200 draw iterations (200 + 200 draws total) took 0 seconds.
Only one chain was sampled, this makes it impossible to run some convergence checks
/home/runner/micromamba/envs/test/lib/python3.14/site-packages/bayespecon/diagnostics/spatial_cv.py:139: UserWarning: W does not appear to be row-standardised (row sums ≠ 1). Most spatial models assume W is row-standardised; results may be unreliable otherwise. For a scipy sparse matrix normalise rows manually (divide each row by its sum). To use a libpysal.graph.Graph set its transformation attribute: graph = graph.transform('r').
  new = model.__class__(
Initializing NUTS using jitter+adapt_diag...
Sequential sampling (1 chains in 1 job)
NUTS: [beta, sigma2]
Sampling 1 chain for 200 tune and 200 draw iterations (200 + 200 draws total) took 0 seconds.
Only one chain was sampled, this makes it impossible to run some convergence checks
/home/runner/micromamba/envs/test/lib/python3.14/site-packages/bayespecon/diagnostics/spatial_cv.py:139: UserWarning: W does not appear to be row-standardised (row sums ≠ 1). Most spatial models assume W is row-standardised; results may be unreliable otherwise. For a scipy sparse matrix normalise rows manually (divide each row by its sum). To use a libpysal.graph.Graph set its transformation attribute: graph = graph.transform('r').
  new = model.__class__(
Initializing NUTS using jitter+adapt_diag...
Sequential sampling (1 chains in 1 job)
NUTS: [beta, sigma2]
Sampling 1 chain for 200 tune and 200 draw iterations (200 + 200 draws total) took 0 seconds.
Only one chain was sampled, this makes it impossible to run some convergence checks
/home/runner/micromamba/envs/test/lib/python3.14/site-packages/bayespecon/diagnostics/spatial_cv.py:139: UserWarning: W does not appear to be row-standardised (row sums ≠ 1). Most spatial models assume W is row-standardised; results may be unreliable otherwise. For a scipy sparse matrix normalise rows manually (divide each row by its sum). To use a libpysal.graph.Graph set its transformation attribute: graph = graph.transform('r').
  new = model.__class__(
Initializing NUTS using jitter+adapt_diag...
Sequential sampling (1 chains in 1 job)
NUTS: [beta, sigma2]
Sampling 1 chain for 200 tune and 200 draw iterations (200 + 200 draws total) took 0 seconds.
Only one chain was sampled, this makes it impossible to run some convergence checks
/home/runner/micromamba/envs/test/lib/python3.14/site-packages/bayespecon/diagnostics/spatial_cv.py:139: UserWarning: W does not appear to be row-standardised (row sums ≠ 1). Most spatial models assume W is row-standardised; results may be unreliable otherwise. For a scipy sparse matrix normalise rows manually (divide each row by its sum). To use a libpysal.graph.Graph set its transformation attribute: graph = graph.transform('r').
  new = model.__class__(
Initializing NUTS using jitter+adapt_diag...
Sequential sampling (1 chains in 1 job)
NUTS: [beta, sigma2]
Sampling 1 chain for 200 tune and 200 draw iterations (200 + 200 draws total) took 0 seconds.
Only one chain was sampled, this makes it impossible to run some convergence checks
/home/runner/micromamba/envs/test/lib/python3.14/site-packages/bayespecon/diagnostics/spatial_cv.py:139: UserWarning: W does not appear to be row-standardised (row sums ≠ 1). Most spatial models assume W is row-standardised; results may be unreliable otherwise. For a scipy sparse matrix normalise rows manually (divide each row by its sum). To use a libpysal.graph.Graph set its transformation attribute: graph = graph.transform('r').
  new = model.__class__(
Initializing NUTS using jitter+adapt_diag...
Sequential sampling (1 chains in 1 job)
NUTS: [beta, sigma2]
Sampling 1 chain for 200 tune and 200 draw iterations (200 + 200 draws total) took 0 seconds.
Only one chain was sampled, this makes it impossible to run some convergence checks
/home/runner/micromamba/envs/test/lib/python3.14/site-packages/bayespecon/diagnostics/spatial_cv.py:139: UserWarning: W does not appear to be row-standardised (row sums ≠ 1). Most spatial models assume W is row-standardised; results may be unreliable otherwise. For a scipy sparse matrix normalise rows manually (divide each row by its sum). To use a libpysal.graph.Graph set its transformation attribute: graph = graph.transform('r').
  new = model.__class__(
/home/runner/micromamba/envs/test/lib/python3.14/site-packages/bayespecon/diagnostics/spatial_cv.py:139: UserWarning: W does not appear to be row-standardised (row sums ≠ 1). Most spatial models assume W is row-standardised; results may be unreliable otherwise. For a scipy sparse matrix normalise rows manually (divide each row by its sum). To use a libpysal.graph.Graph set its transformation attribute: graph = graph.transform('r').
  new = model.__class__(
/home/runner/micromamba/envs/test/lib/python3.14/site-packages/bayespecon/diagnostics/spatial_cv.py:139: UserWarning: W does not appear to be row-standardised (row sums ≠ 1). Most spatial models assume W is row-standardised; results may be unreliable otherwise. For a scipy sparse matrix normalise rows manually (divide each row by its sum). To use a libpysal.graph.Graph set its transformation attribute: graph = graph.transform('r').
  new = model.__class__(
/home/runner/micromamba/envs/test/lib/python3.14/site-packages/bayespecon/diagnostics/spatial_cv.py:139: UserWarning: W does not appear to be row-standardised (row sums ≠ 1). Most spatial models assume W is row-standardised; results may be unreliable otherwise. For a scipy sparse matrix normalise rows manually (divide each row by its sum). To use a libpysal.graph.Graph set its transformation attribute: graph = graph.transform('r').
  new = model.__class__(
/home/runner/micromamba/envs/test/lib/python3.14/site-packages/bayespecon/diagnostics/spatial_cv.py:139: UserWarning: W does not appear to be row-standardised (row sums ≠ 1). Most spatial models assume W is row-standardised; results may be unreliable otherwise. For a scipy sparse matrix normalise rows manually (divide each row by its sum). To use a libpysal.graph.Graph set its transformation attribute: graph = graph.transform('r').
  new = model.__class__(
/home/runner/micromamba/envs/test/lib/python3.14/site-packages/bayespecon/diagnostics/spatial_cv.py:139: UserWarning: W does not appear to be row-standardised (row sums ≠ 1). Most spatial models assume W is row-standardised; results may be unreliable otherwise. For a scipy sparse matrix normalise rows manually (divide each row by its sum). To use a libpysal.graph.Graph set its transformation attribute: graph = graph.transform('r').
  new = model.__class__(
/home/runner/micromamba/envs/test/lib/python3.14/site-packages/bayespecon/diagnostics/spatial_cv.py:139: UserWarning: W does not appear to be row-standardised (row sums ≠ 1). Most spatial models assume W is row-standardised; results may be unreliable otherwise. For a scipy sparse matrix normalise rows manually (divide each row by its sum). To use a libpysal.graph.Graph set its transformation attribute: graph = graph.transform('r').
  new = model.__class__(
/home/runner/micromamba/envs/test/lib/python3.14/site-packages/bayespecon/diagnostics/spatial_cv.py:139: UserWarning: W does not appear to be row-standardised (row sums ≠ 1). Most spatial models assume W is row-standardised; results may be unreliable otherwise. For a scipy sparse matrix normalise rows manually (divide each row by its sum). To use a libpysal.graph.Graph set its transformation attribute: graph = graph.transform('r').
  new = model.__class__(
/home/runner/micromamba/envs/test/lib/python3.14/site-packages/bayespecon/diagnostics/spatial_cv.py:139: UserWarning: W does not appear to be row-standardised (row sums ≠ 1). Most spatial models assume W is row-standardised; results may be unreliable otherwise. For a scipy sparse matrix normalise rows manually (divide each row by its sum). To use a libpysal.graph.Graph set its transformation attribute: graph = graph.transform('r').
  new = model.__class__(
/home/runner/micromamba/envs/test/lib/python3.14/site-packages/bayespecon/diagnostics/spatial_cv.py:139: UserWarning: W does not appear to be row-standardised (row sums ≠ 1). Most spatial models assume W is row-standardised; results may be unreliable otherwise. For a scipy sparse matrix normalise rows manually (divide each row by its sum). To use a libpysal.graph.Graph set its transformation attribute: graph = graph.transform('r').
  new = model.__class__(
elpd se n_folds delta_elpd
SAR -210.761472 1.695869 5 0.000000
SLX -235.794173 2.310836 5 -25.032701
SEM -239.662285 1.398067 5 -28.900813
OLS -282.428976 2.467718 5 -71.667504

With data simulated from a SAR DGP, spatial block CV should rank SAR on top, with the misspecified alternatives trailing behind. The delta_elpd column shows the elpd difference relative to the best model — values close to zero indicate near-equivalent predictive performance, while large negative values indicate worse out-of-block prediction.