Bayesian Spatial Models: A Pedagogical Walkthrough

This notebook demonstrates all five models currently implemented in bayespecon:

  1. SLX

  2. SAR

  3. SEM

  4. SDM

  5. SDEM

Each section explains the model idea, fits the model, and inspects posterior summaries and spatial effects.

Conceptual Roadmap

Let \(W\) denote the spatial weights matrix and \(WX\) the spatial lag of covariates.

  • SLX: \(y = X\beta + WX\theta + \epsilon\)

  • SAR: \(y = \rho Wy + X\beta + \epsilon\)

  • SEM: \(y = X\beta + u\), \(u = \lambda Wu + \epsilon\)

  • SDM: \(y = \rho Wy + X\beta + WX\theta + \epsilon\)

  • SDEM: \(y = X\beta + WX\theta + u\), \(u = \lambda Wu + \epsilon\)

In bayespecon, all models accept either:

  • formula mode: model_class(formula=..., data=..., W=...), or

  • matrix mode: model_class(y=..., X=..., W=...).

import arviz as az
import geodatasets
import geopandas as gpd
import matplotlib.pyplot as plt
import pandas as pd
from libpysal.graph import Graph

from bayespecon import OLS, SAR, SDEM, SDM, SEM, SLX
from bayespecon.diagnostics.bayesfactor import bayes_factor_compare_models

az.style.use("arviz-white")
/home/runner/micromamba/envs/test/lib/python3.14/site-packages/tqdm/auto.py:21: TqdmWarning: IProgress not found. Please update jupyter and ipywidgets. See https://ipywidgets.readthedocs.io/en/stable/user_install.html
  from .autonotebook import tqdm as notebook_tqdm
# Load sample spatial dataset
gdf = gpd.read_file(geodatasets.data.geoda.airbnb.url)
xcols = ["poverty", "rev_rating", "num_spots", "crowded"]
ycol = "price_pp"
gdf = gdf.dropna(subset=xcols + [ycol]).copy()

# Build contiguity graph (queen)
W = Graph.build_contiguity(gdf, rook=False).transform("r")

print(f"Observations: {len(gdf)}")
print(f"Predictors: {xcols}")
Observations: 67
Predictors: ['poverty', 'rev_rating', 'num_spots', 'crowded']

Helper Functions

To keep each model section focused, we use utility functions to:

  • fit with consistent MCMC settings,

  • print posterior summaries,

  • tabulate direct/indirect/total effects.

For a quick pedagogical run, we use small draw counts. Increase these for real inference.

def fit_and_report(model_cls, formula, data, W, draws=400, tune=400, chains=4, seed=42):
    """Fit a spatial model and return (model, summary, effects_df).

    Uses minimal MCMC settings for pedagogical demonstration.
    Increase draws/tune for real analyses.
    """
    model = model_cls(formula=formula, data=data, W=W, logdet_method="eigenvalue")
    model.fit(
        draws=draws,
        tune=tune,
        chains=chains,
        target_accept=0.9,
        random_seed=seed,
        progressbar=False,
        idata_kwargs={"log_likelihood": True},
    )
    summary = model.summary(round_to=3)
    effects_df = pd.DataFrame(model.spatial_effects())
    return model, summary, effects_df

Convergence Diagnostics Helper

For each fitted model, we will inspect:

  • r_hat (target close to 1.00)

  • effective sample sizes (ess_bulk, ess_tail)

  • trace plots for key parameters.

def diagnostics_table(idata, var_names):
    """Show key MCMC diagnostics for the given parameters."""
    cols = ["mean", "sd", "ess_bulk", "ess_tail", "r_hat"]
    return az.summary(idata, var_names=var_names, round_to=3)[cols]


def show_trace(idata, var_names, title):
    """Plot trace plots for the given parameters."""
    az.plot_trace(idata, var_names=var_names)
    plt.suptitle(title, y=1.02)
    plt.tight_layout()
    plt.show()

Models

0) OLS: What could go wrong?

Model:

\[ y = X\beta + \epsilon \]
ols = OLS(
    formula="price_pp ~ poverty + rev_rating + num_spots + crowded",
    data=gdf,
    W=W,
)
olsfit = ols.fit(
    draws=400,
    tune=400,
    chains=4,
    target_accept=0.9,
    random_seed=42,
    progressbar=False,
    idata_kwargs={"log_likelihood": True},
)
Initializing NUTS using jitter+adapt_diag...
Multiprocess sampling (4 chains in 2 jobs)
NUTS: [beta, sigma]
/home/runner/micromamba/envs/test/lib/python3.14/site-packages/pymc/step_methods/hmc/quadpotential.py:316: RuntimeWarning: overflow encountered in dot
  return 0.5 * np.dot(x, v_out)
Sampling 4 chains for 400 tune and 400 draw iterations (1_600 + 1_600 draws total) took 17 seconds.
ols.summary()
mean sd hdi_3% hdi_97% mcse_mean mcse_sd ess_bulk ess_tail r_hat
Intercept -9.492 85.004 -163.973 154.189 3.221 2.087 694.0 835.0 1.01
poverty 0.272 0.343 -0.365 0.940 0.010 0.009 1142.0 886.0 1.00
rev_rating 0.845 0.891 -0.877 2.466 0.033 0.022 709.0 817.0 1.01
num_spots 0.119 0.023 0.079 0.162 0.001 0.001 1279.0 1066.0 1.00
crowded -2.261 0.922 -4.075 -0.622 0.026 0.024 1308.0 1107.0 1.00
sigma 26.243 2.212 22.556 30.772 0.063 0.060 1227.0 927.0 1.01

1) SLX: Spatially Lagged Covariates Only

Model:

\[ y = X\beta + WX\theta + \epsilon \]

Interpretation:

  • beta captures local covariate effects.

  • theta captures neighbor-covariate spillovers.

  • No spatial lag on \(y\), so no autoregressive propagation through outcomes.

slx, summary_slx, effects_slx = fit_and_report(
    SLX,
    formula="price_pp ~ poverty + rev_rating + num_spots + crowded",
    data=gdf,
    W=W,
)
display(summary_slx)
display(effects_slx)
Initializing NUTS using jitter+adapt_diag...
Multiprocess sampling (4 chains in 2 jobs)
NUTS: [beta, sigma]
/home/runner/micromamba/envs/test/lib/python3.14/site-packages/pymc/step_methods/hmc/quadpotential.py:316: RuntimeWarning: overflow encountered in dot
  return 0.5 * np.dot(x, v_out)
Sampling 4 chains for 400 tune and 400 draw iterations (1_600 + 1_600 draws total) took 28 seconds.
mean sd hdi_3% hdi_97% mcse_mean mcse_sd ess_bulk ess_tail r_hat
Intercept 198.403 214.440 -216.269 579.810 8.396 6.091 650.362 734.798 1.005
poverty -0.655 0.455 -1.486 0.222 0.016 0.013 826.660 854.663 1.000
rev_rating 0.265 0.911 -1.323 2.049 0.031 0.030 885.762 734.053 1.009
num_spots 0.019 0.030 -0.034 0.078 0.001 0.001 774.424 820.683 1.000
crowded -1.078 1.176 -3.237 1.093 0.039 0.028 893.113 1080.138 1.001
W*poverty 1.148 0.692 -0.209 2.357 0.026 0.020 709.801 834.838 1.003
W*rev_rating -1.776 1.847 -4.915 2.032 0.068 0.046 731.538 815.410 1.004
W*num_spots 0.203 0.048 0.108 0.288 0.002 0.001 766.181 782.735 1.002
W*crowded -1.390 1.666 -4.519 1.585 0.060 0.042 769.779 983.167 1.001
sigma 23.717 2.027 20.246 27.729 0.065 0.061 1021.307 614.013 1.007
direct direct_ci_lower direct_ci_upper direct_pvalue indirect indirect_ci_lower indirect_ci_upper indirect_pvalue total total_ci_lower total_ci_upper total_pvalue
variable
poverty -0.655270 -1.563164 0.233338 0.14625 1.147955 -0.178487 2.545777 0.0925 0.492684 -0.524654 1.502410 0.34625
rev_rating 0.264813 -1.446970 2.049883 0.77250 -1.776420 -5.514831 1.733292 0.3300 -1.511607 -5.711824 2.936374 0.50500
num_spots 0.019149 -0.038408 0.078050 0.52750 0.203178 0.108604 0.297708 0.0000 0.222327 0.159254 0.286293 0.00000
crowded -1.078246 -3.258292 1.271095 0.36250 -1.390266 -4.513828 1.821181 0.3975 -2.468513 -4.821151 0.096495 0.06000
az.plot_forest(slx.inference_data)
display(diagnostics_table(slx.inference_data, ["beta", "sigma"]))
show_trace(slx.inference_data, ["sigma"], "SLX Trace: sigma")
mean sd ess_bulk ess_tail r_hat
beta[Intercept] 198.403 214.440 650.362 734.798 1.005
beta[poverty] -0.655 0.455 826.660 854.663 1.000
beta[rev_rating] 0.265 0.911 885.762 734.053 1.009
beta[num_spots] 0.019 0.030 774.424 820.683 1.000
beta[crowded] -1.078 1.176 893.113 1080.138 1.001
beta[W*poverty] 1.148 0.692 709.801 834.838 1.003
beta[W*rev_rating] -1.776 1.847 731.538 815.410 1.004
beta[W*num_spots] 0.203 0.048 766.181 782.735 1.002
beta[W*crowded] -1.390 1.666 769.779 983.167 1.001
sigma 23.717 2.027 1021.307 614.013 1.007
/tmp/ipykernel_7113/3581438262.py:11: UserWarning: The figure layout has changed to tight
  plt.tight_layout()
../_images/de37ce9758acb47e8bf3e16573eaff3041d30ff317a8831cea8f59d6765aa28b.png ../_images/d43838c8eb95f9ee27237559e6b86c85b5904f99ad7a9a09a812cbfa01399b5b.png

2) SAR: Spatial Lag of Outcome

Model:

\[ y = \rho Wy + X\beta + \epsilon \]

Interpretation:

  • \(\rho\) controls feedback from neighboring outcomes.

  • Effects are amplified through \((I - \rho W)^{-1}\), so direct and indirect impacts differ from raw \(\beta\).

sar, summary_sar, effects_sar = fit_and_report(
    SAR,
    formula="price_pp ~ poverty + rev_rating + num_spots + crowded",
    data=gdf,
    W=W,
)
display(summary_sar)
display(effects_sar)
Initializing NUTS using jitter+adapt_diag...
Multiprocess sampling (4 chains in 2 jobs)
NUTS: [rho, beta, sigma]
/home/runner/micromamba/envs/test/lib/python3.14/site-packages/pymc/step_methods/hmc/quadpotential.py:316: RuntimeWarning: overflow encountered in dot
  return 0.5 * np.dot(x, v_out)
/home/runner/micromamba/envs/test/lib/python3.14/site-packages/pymc/step_methods/hmc/quadpotential.py:316: RuntimeWarning: overflow encountered in dot
  return 0.5 * np.dot(x, v_out)
Sampling 4 chains for 400 tune and 400 draw iterations (1_600 + 1_600 draws total) took 17 seconds.
mean sd hdi_3% hdi_97% mcse_mean mcse_sd ess_bulk ess_tail r_hat
Intercept -49.276 81.126 -199.623 104.468 3.013 2.155 725.708 865.937 1.006
poverty 0.102 0.299 -0.423 0.706 0.008 0.006 1324.847 1219.618 1.001
rev_rating 0.969 0.842 -0.740 2.418 0.031 0.022 745.602 874.384 1.004
num_spots 0.076 0.025 0.034 0.126 0.001 0.001 1076.119 1015.372 1.000
crowded -1.741 0.860 -3.355 -0.230 0.024 0.020 1295.382 873.380 1.001
rho 0.443 0.139 0.185 0.693 0.004 0.003 991.221 891.549 1.000
sigma 24.339 2.156 20.406 28.327 0.063 0.059 1175.325 1113.562 1.003
direct direct_ci_lower direct_ci_upper direct_pvalue indirect indirect_ci_lower indirect_ci_upper indirect_pvalue total total_ci_lower total_ci_upper total_pvalue
variable
poverty 0.107294 -0.506917 0.740681 0.73000 0.069012 -0.503319 0.688781 0.73375 0.176306 -0.987863 1.393020 0.73000
rev_rating 1.035809 -0.729173 2.809097 0.24250 0.871229 -0.494138 3.531787 0.24625 1.907038 -1.180677 6.143653 0.24250
num_spots 0.080435 0.033317 0.130604 0.00000 0.059244 0.015922 0.131901 0.00375 0.139679 0.062623 0.240899 0.00000
crowded -1.847340 -3.563650 -0.176338 0.02875 -1.399176 -3.965908 -0.094650 0.03250 -3.246516 -6.958233 -0.359249 0.02875
az.plot_forest(sar.inference_data)
display(diagnostics_table(sar.inference_data, ["rho", "beta", "sigma"]))
show_trace(sar.inference_data, ["rho", "sigma"], "SAR Trace: rho, sigma")
mean sd ess_bulk ess_tail r_hat
rho 0.443 0.139 991.221 891.549 1.000
beta[Intercept] -49.276 81.126 725.708 865.937 1.006
beta[poverty] 0.102 0.299 1324.847 1219.618 1.001
beta[rev_rating] 0.969 0.842 745.602 874.384 1.004
beta[num_spots] 0.076 0.025 1076.119 1015.372 1.000
beta[crowded] -1.741 0.860 1295.382 873.380 1.001
sigma 24.339 2.156 1175.325 1113.562 1.003
/tmp/ipykernel_7113/3581438262.py:11: UserWarning: The figure layout has changed to tight
  plt.tight_layout()
../_images/04da705a9e3edae651dfdc1a68c2626693e4abb3ea0d2ab380810a4a75590add.png ../_images/7607981501c19380f700972d31f9f9f00d86fd0923805f5af58f73b5b4417d2c.png

3) SEM: Spatial Correlation in Errors

Model:

\[ y = X\beta + u, \quad u = \lambda Wu + \epsilon \]

Interpretation:

  • Spatial dependence is moved to latent shocks rather than outcome feedback.

  • Useful when omitted spatial factors induce correlated residuals.

sem, summary_sem, effects_sem = fit_and_report(
    SEM,
    formula="price_pp ~ poverty + rev_rating + num_spots + crowded",
    data=gdf,
    W=W,
)
display(summary_sem)
display(effects_sem)
Initializing NUTS using jitter+adapt_diag...
Multiprocess sampling (4 chains in 2 jobs)
NUTS: [lam, beta, sigma]
/home/runner/micromamba/envs/test/lib/python3.14/site-packages/pymc/step_methods/hmc/quadpotential.py:316: RuntimeWarning: overflow encountered in dot
  return 0.5 * np.dot(x, v_out)
/home/runner/micromamba/envs/test/lib/python3.14/site-packages/pymc/step_methods/hmc/quadpotential.py:316: RuntimeWarning: overflow encountered in dot
  return 0.5 * np.dot(x, v_out)
Sampling 4 chains for 400 tune and 400 draw iterations (1_600 + 1_600 draws total) took 14 seconds.
There were 4 divergences after tuning. Increase `target_accept` or reparameterize.
mean sd hdi_3% hdi_97% mcse_mean mcse_sd ess_bulk ess_tail r_hat
Intercept -27.196 77.808 -179.249 107.308 3.122 2.412 621.425 726.421 1.008
poverty -0.111 0.449 -0.914 0.770 0.015 0.010 942.050 1069.055 1.004
rev_rating 1.119 0.826 -0.317 2.691 0.033 0.025 619.239 717.234 1.008
num_spots 0.072 0.034 0.007 0.132 0.001 0.001 766.505 862.262 1.003
crowded -1.818 1.158 -4.037 0.361 0.036 0.028 1039.232 972.899 1.004
lam 0.505 0.179 0.118 0.790 0.008 0.005 556.334 475.293 1.003
sigma 25.043 2.227 21.051 29.174 0.068 0.057 1065.389 1073.024 1.002
direct direct_ci_lower direct_ci_upper direct_pvalue indirect indirect_ci_lower indirect_ci_upper indirect_pvalue total total_ci_lower total_ci_upper total_pvalue
variable
poverty -0.110960 -1.048006 0.741693 0.84500 0.0 0.0 0.0 0.0 -0.110960 -1.048006 0.741693 0.84500
rev_rating 1.119062 -0.503201 2.713468 0.16125 0.0 0.0 0.0 0.0 1.119062 -0.503201 2.713468 0.16125
num_spots 0.071912 0.005288 0.135631 0.03875 0.0 0.0 0.0 0.0 0.071912 0.005288 0.135631 0.03875
crowded -1.817735 -4.069739 0.496977 0.13250 0.0 0.0 0.0 0.0 -1.817735 -4.069739 0.496977 0.13250
az.plot_forest(sem.inference_data)
display(diagnostics_table(sem.inference_data, ["lam", "beta", "sigma"]))
show_trace(sem.inference_data, ["lam", "sigma"], "SEM Trace: lam, sigma")
mean sd ess_bulk ess_tail r_hat
lam 0.505 0.179 556.334 475.293 1.003
beta[Intercept] -27.196 77.808 621.425 726.421 1.008
beta[poverty] -0.111 0.449 942.050 1069.055 1.004
beta[rev_rating] 1.119 0.826 619.239 717.234 1.008
beta[num_spots] 0.072 0.034 766.505 862.262 1.003
beta[crowded] -1.818 1.158 1039.232 972.899 1.004
sigma 25.043 2.227 1065.389 1073.024 1.002
/tmp/ipykernel_7113/3581438262.py:11: UserWarning: The figure layout has changed to tight
  plt.tight_layout()
../_images/982365ae89596a4ce5b4c41aeec35fe2ab38368bafbc7e62b71acfd9b36e984c.png ../_images/2d3d775a2cc23a4de02106a53b2b585e5058d0bb62cee7ee134e7166773c0797.png

4) SDM: SAR + SLX

Model:

\[ y = \rho Wy + X\beta + WX\theta + \epsilon \]

Interpretation:

  • Includes both outcome feedback and neighbor-covariate channels.

  • Often used as a flexible nesting model for SAR/SLX.

sdm, summary_sdm, effects_sdm = fit_and_report(
    SDM,
    formula="price_pp ~ poverty + rev_rating + num_spots + crowded",
    data=gdf,
    W=W,
)
display(summary_sdm)
display(effects_sdm)
Initializing NUTS using jitter+adapt_diag...
Multiprocess sampling (4 chains in 2 jobs)
NUTS: [rho, beta, sigma]
/home/runner/micromamba/envs/test/lib/python3.14/site-packages/pymc/step_methods/hmc/quadpotential.py:316: RuntimeWarning: overflow encountered in dot
  return 0.5 * np.dot(x, v_out)
/home/runner/micromamba/envs/test/lib/python3.14/site-packages/pymc/step_methods/hmc/quadpotential.py:316: RuntimeWarning: overflow encountered in dot
  return 0.5 * np.dot(x, v_out)
Sampling 4 chains for 400 tune and 400 draw iterations (1_600 + 1_600 draws total) took 30 seconds.
mean sd hdi_3% hdi_97% mcse_mean mcse_sd ess_bulk ess_tail r_hat
Intercept 197.552 203.064 -179.432 571.527 7.678 5.043 702.865 979.518 1.002
poverty -0.677 0.445 -1.530 0.133 0.015 0.012 895.828 803.585 1.003
rev_rating 0.378 0.907 -1.282 2.117 0.026 0.024 1209.213 1245.381 1.000
num_spots 0.016 0.031 -0.043 0.072 0.001 0.001 1185.523 1039.527 1.002
crowded -1.085 1.121 -3.363 0.781 0.040 0.027 788.075 839.000 1.000
W*poverty 1.029 0.665 -0.281 2.203 0.024 0.015 799.091 959.339 1.001
W*rev_rating -2.025 1.808 -5.366 1.271 0.065 0.043 782.291 1059.001 1.002
W*num_spots 0.170 0.052 0.073 0.267 0.002 0.001 968.196 1082.774 1.006
W*crowded -0.923 1.690 -4.083 2.335 0.059 0.040 834.732 813.964 1.001
rho 0.230 0.168 -0.096 0.529 0.005 0.004 1069.556 1056.787 1.001
sigma 23.193 1.968 19.979 27.426 0.063 0.049 1029.107 1201.134 1.006
direct direct_ci_lower direct_ci_upper direct_pvalue indirect indirect_ci_lower indirect_ci_upper indirect_pvalue total total_ci_lower total_ci_upper total_pvalue
variable
poverty -0.636300 -1.511830 0.204442 0.12625 1.074056 -0.512864 2.567043 0.17000 0.437756 -0.998291 1.783824 0.5100
rev_rating 0.254095 -1.568348 2.156596 0.77750 -2.573229 -7.978988 2.106844 0.31125 -2.319134 -8.947258 3.287885 0.4625
num_spots 0.025322 -0.035468 0.082774 0.40250 0.220444 0.116234 0.350051 0.00000 0.245765 0.162914 0.361073 0.0000
crowded -1.145194 -3.223130 0.967304 0.27250 -1.474455 -5.410931 2.346805 0.44500 -2.619650 -6.107759 0.632563 0.1050
az.plot_forest(sdm.inference_data)
display(diagnostics_table(sdm.inference_data, ["rho", "beta", "sigma"]))
show_trace(sdm.inference_data, ["rho", "sigma"], "SDM Trace: rho, sigma")
mean sd ess_bulk ess_tail r_hat
rho 0.230 0.168 1069.556 1056.787 1.001
beta[Intercept] 197.552 203.064 702.865 979.518 1.002
beta[poverty] -0.677 0.445 895.828 803.585 1.003
beta[rev_rating] 0.378 0.907 1209.213 1245.381 1.000
beta[num_spots] 0.016 0.031 1185.523 1039.527 1.002
beta[crowded] -1.085 1.121 788.075 839.000 1.000
beta[W*poverty] 1.029 0.665 799.091 959.339 1.001
beta[W*rev_rating] -2.025 1.808 782.291 1059.001 1.002
beta[W*num_spots] 0.170 0.052 968.196 1082.774 1.006
beta[W*crowded] -0.923 1.690 834.732 813.964 1.001
sigma 23.193 1.968 1029.107 1201.134 1.006
/tmp/ipykernel_7113/3581438262.py:11: UserWarning: The figure layout has changed to tight
  plt.tight_layout()
../_images/e84d1a037e5c4a357fb1e8c0fd21bdb9dd2f40e0baadbfbbb7d93d68b7588f87.png ../_images/b718ef581465fe9f4f96d57f4885029bfa65d6d0325854838717efb76dae51fd.png

5) SDEM: SLX + Spatial Error

Model:

\[ y = X\beta + WX\theta + u, \quad u = \lambda Wu + \epsilon \]

Interpretation:

  • Neighbor covariates matter directly (through \(WX\)),

  • and unmodeled shocks are spatially correlated (through \(u\)).

sdem, summary_sdem, effects_sdem = fit_and_report(
    SDEM,
    formula="price_pp ~ poverty + rev_rating + num_spots + crowded",
    data=gdf,
    W=W,
)
display(summary_sdem)
display(effects_sdem)
Initializing NUTS using jitter+adapt_diag...
Multiprocess sampling (4 chains in 2 jobs)
NUTS: [lam, beta, sigma]
/home/runner/micromamba/envs/test/lib/python3.14/site-packages/pymc/step_methods/hmc/quadpotential.py:316: RuntimeWarning: overflow encountered in dot
  return 0.5 * np.dot(x, v_out)
/home/runner/micromamba/envs/test/lib/python3.14/site-packages/pymc/step_methods/hmc/quadpotential.py:316: RuntimeWarning: overflow encountered in dot
  return 0.5 * np.dot(x, v_out)
Sampling 4 chains for 400 tune and 400 draw iterations (1_600 + 1_600 draws total) took 26 seconds.
There were 6 divergences after tuning. Increase `target_accept` or reparameterize.
The rhat statistic is larger than 1.01 for some parameters. This indicates problems during sampling. See https://arxiv.org/abs/1903.08008 for details
mean sd hdi_3% hdi_97% mcse_mean mcse_sd ess_bulk ess_tail r_hat
Intercept 250.173 240.167 -213.748 661.775 11.533 6.177 437.557 910.985 1.016
poverty -0.659 0.424 -1.493 0.068 0.013 0.009 1055.612 1061.682 1.008
rev_rating 0.244 0.966 -1.639 2.071 0.036 0.022 739.208 1085.292 1.010
num_spots 0.028 0.030 -0.034 0.081 0.001 0.001 1042.504 967.294 1.002
crowded -1.096 1.102 -3.145 1.056 0.033 0.024 1122.155 994.902 1.003
W*poverty 0.954 0.695 -0.356 2.230 0.025 0.016 794.642 921.693 1.003
W*rev_rating -2.257 1.909 -5.604 1.352 0.088 0.046 470.999 826.123 1.013
W*num_spots 0.187 0.052 0.090 0.282 0.002 0.001 944.282 1022.535 1.005
W*crowded -1.579 1.876 -5.151 1.919 0.059 0.051 1026.080 1274.800 1.001
lam 0.324 0.191 -0.045 0.663 0.006 0.004 987.384 1073.611 1.005
sigma 23.302 1.983 19.909 27.125 0.056 0.050 1268.999 1167.931 1.002
direct direct_ci_lower direct_ci_upper direct_pvalue indirect indirect_ci_lower indirect_ci_upper indirect_pvalue total total_ci_lower total_ci_upper total_pvalue
variable
poverty -0.659241 -1.458998 0.215810 0.12125 0.953790 -0.428991 2.285650 0.17750 0.294549 -1.077492 1.567065 0.66125
rev_rating 0.244189 -1.765513 2.127001 0.78250 -2.257267 -6.068413 1.342616 0.25125 -2.013078 -7.167563 2.545931 0.43125
num_spots 0.027988 -0.035114 0.086002 0.32500 0.186729 0.087106 0.290087 0.00000 0.214717 0.124809 0.302384 0.00000
crowded -1.096143 -3.246375 1.104345 0.32875 -1.578919 -5.224640 2.135995 0.38000 -2.675061 -5.847786 0.948444 0.11125
az.plot_forest(sdem.inference_data)
display(diagnostics_table(sdem.inference_data, ["lam", "beta", "sigma"]))
show_trace(sdem.inference_data, ["lam", "sigma"], "SDEM Trace: lam, sigma")
mean sd ess_bulk ess_tail r_hat
lam 0.324 0.191 987.384 1073.611 1.005
beta[Intercept] 250.173 240.167 437.557 910.985 1.016
beta[poverty] -0.659 0.424 1055.612 1061.682 1.008
beta[rev_rating] 0.244 0.966 739.208 1085.292 1.010
beta[num_spots] 0.028 0.030 1042.504 967.294 1.002
beta[crowded] -1.096 1.102 1122.155 994.902 1.003
beta[W*poverty] 0.954 0.695 794.642 921.693 1.003
beta[W*rev_rating] -2.257 1.909 470.999 826.123 1.013
beta[W*num_spots] 0.187 0.052 944.282 1022.535 1.005
beta[W*crowded] -1.579 1.876 1026.080 1274.800 1.001
sigma 23.302 1.983 1268.999 1167.931 1.002
/tmp/ipykernel_7113/3581438262.py:11: UserWarning: The figure layout has changed to tight
  plt.tight_layout()
../_images/12db46bd7c268aa1aaf6020a4ef8665734ebe0988d67f4a42a06b75bea6d1eed.png ../_images/eecc07267dd4129f2d5ab9708a1da0b3d22431c9ea94bd030485613e64ef2be0.png

MCMC Sampling Adequacy

Bayesian estimation of spatial models has a well-known efficiency pitfall: the spatial dependence parameter \(\rho\) (or \(\lambda\)) often mixes slowly, so a chain that looks converged in its point estimate can still produce posterior credible intervals that are 10–12 % too narrow because the sampler has not visited the tails enough times [Wolf et al., 2018]. The spatial_mcmc_diagnostic helper checks effective sample size, sampler yield, \(\hat{R}\), and HPDI stability for the spatial parameter and warns when any threshold is violated.

from bayespecon import spatial_mcmc_diagnostic

# Run on the SDM (most parameters; both rho and impact summaries depend on it)
report = spatial_mcmc_diagnostic(sdm, emit_warnings=False)
report.to_frame()
ess_bulk ess_tail r_hat mcse_mean yield_pct hpdi_drift_pct adequate
parameter
rho 1069.556086 1056.786799 1.00133 0.005163 66.847255 3.827624 True
sdm.spatial_diagnostics()
statistic median df p_value ci_lower ci_upper
test
LM-Error-SDM 1.020552 0.407088 1 0.312388 0.001056 5.954121
Robust-LM-Error-SDM 4.832126 4.622864 1 0.027934 0.036720 11.323437
ols.spatial_diagnostics()
statistic median df p_value ci_lower ci_upper
test
LM-Lag 22.079712 10.588424 1 0.000003 0.015565 103.588908
LM-Error 4.040401 3.188498 1 0.044423 0.717507 12.567629
LM-SDM-Joint 3649.701192 1572.442319 5 0.000000 25.179629 18701.613847
LM-SLX-Error-Joint 3651.437896 1575.318771 5 0.000000 25.791326 18706.073180
Robust-LM-Lag 12.028354 11.394298 1 0.000524 5.623500 21.461866
Robust-LM-Error 5.906351 5.595007 1 0.015086 2.761339 10.538541
sar.spatial_diagnostics_decision()
../_images/31f211cdb7b189f785d7392694d8932af37c8a7f057e9b67f73ea58f12317596.svg
sem.spatial_diagnostics()
statistic median df p_value ci_lower ci_upper
test
LM-Lag 178.159853 58.530743 1 0.000000 0.106393 1024.410783
LM-WX 27369.491190 6859.925190 4 0.000000 68.184423 154209.397357
Robust-LM-Lag 1.972060 0.910857 1 0.160229 0.002683 9.301223
Robust-LM-WX 924.248356 446.509084 4 0.000000 12.548655 4578.160719
report
SpatialMCMCReport(parameters=['rho'], ess_bulk={'rho': 1069.556086208102}, ess_tail={'rho': 1056.7867993879127}, r_hat={'rho': 1.0013301817996323}, mcse_mean={'rho': 0.005163045813358309}, nominal_size=1600, yield_pct={'rho': 66.84725538800637}, hpdi_drift_pct={'rho': 3.8276241815940053}, warnings_triggered=[], adequate=True, adequate_by_param={'rho': True})

Model Comparison

# Collect all fitted models for comparison
model_dict = {
    "OLS": ols,
    "SLX": slx,
    "SAR": sar,
    "SEM": sem,
    "SDM": sdm,
    "SDEM": sdem,
}
idata_dict = {name: m.inference_data for name, m in model_dict.items()}

Bayes Factor Model Comparison

Bayes factors provide an alternative to information criteria for comparing competing models. While WAIC and LOO assess predictive performance, Bayes factors compare marginal likelihoods — the probability of the data under each model, integrated over all parameter values weighted by the prior:

\[BF_{ij} = \frac{p(y \mid \mathcal{M}_i)}{p(y \mid \mathcal{M}_j)} = \frac{ML_i}{ML_j}\]

This makes Bayes factors sensitive to the prior: models with diffuse priors on unnecessary parameters are penalized more heavily, because the marginal likelihood averages over all possible parameter values weighted by the prior. In spatial models, this means that models with many WX coefficients under wide priors (e.g., SLX, SDM, SDEM) can receive substantially lower marginal likelihoods than more parsimonious alternatives (e.g., OLS, SAR, SEM) — even if their log-likelihoods are similar.

Interpreting Bayes factors (Kass & Raftery, 1995):

BF range

Evidence strength

1 – 3

Anecdotal

3 – 10

Moderate

10 – 30

Strong

30 – 100

Very strong

> 100

Extreme

Method. We use bridge sampling (Meng & Wong, 1996) to estimate each model’s log marginal likelihood, following the R bridgesampling package (Gronau et al., 2020). The implementation uses ESS weighting, two-phase convergence, and MCSE diagnostics. For reliable estimates, 40,000+ posterior samples are recommended.

Important caveat. Bayes factors can be very sensitive to prior specification. Models with diffuse priors on many parameters will tend to have lower marginal likelihoods. This is by design — it is the Bayesian Occam’s razor at work — but it means that Bayes factors and information criteria may disagree when priors are uninformative.

bayes_factor_compare_models(model_dict, method="bridge").round(3)
/tmp/ipykernel_7113/3295012235.py:1: UserWarning: Bridge sampling with 1600 posterior samples for 'OLS' may yield imprecise marginal-likelihood estimates. A conservative rule of thumb is 40,000+ samples (Gronau, Singmann, & Wagenmakers, 2017).
  bayes_factor_compare_models(model_dict, method="bridge").round(3)
OLS SLX SAR SEM SDM SDEM
OLS 1.000 6.390332e+20 0.066 0.179 1.016243e+21 5.730165e+20
SLX 0.000 1.000000e+00 0.000 0.000 1.590000e+00 8.970000e-01
SAR 15.209 9.719181e+21 1.000 2.730 1.545624e+22 8.715118e+21
SEM 5.571 3.560225e+21 0.366 1.000 5.661763e+21 3.192428e+21
SDM 0.000 6.290000e-01 0.000 0.000 1.000000e+00 5.640000e-01
SDEM 0.000 1.115000e+00 0.000 0.000 1.773000e+00 1.000000e+00
bayes_factor_compare_models(model_dict, method="bic").round(3)
OLS SLX SAR SEM SDM SDEM
OLS 1.000 0.310 0.124 1.648 0.906 1.454
SLX 3.222 1.000 0.400 5.310 2.920 4.684
SAR 8.051 2.499 1.000 13.268 7.296 11.704
SEM 0.607 0.188 0.075 1.000 0.550 0.882
SDM 1.103 0.342 0.137 1.818 1.000 1.604
SDEM 0.688 0.213 0.085 1.134 0.623 1.000

Bridge Sampling vs. BIC

  • BIC: SAR > SLX > OLS > SEM (SAR wins, SLX is moderate, SEM is worst)

  • Bridge: SAR > SEM > OLS (SAR wins, SEM is moderate)

The two methods can produce qualitatively different model rankings. This is expected and reflects a fundamental difference in what they assume about priors:

  • BIC approximates \(\log(ML) \approx \hat\ell_{\max} - \frac{k}{2}\log(n)\), which assumes unit-information priors (priors containing as much information as a single observation). The penalty per parameter is fixed at \(\frac{1}{2}\log(n) \approx 2.1\) for \(n = 77\).

  • Bridge sampling integrates over the actual priors in the model. When priors are wide (e.g., Normal(0, 100) on WX coefficients), the marginal likelihood penalizes each such parameter by roughly \(\log(\sigma_{\text{prior}} / \sigma_{\text{post}})\), which can be 5–10× larger than the BIC penalty.

This is why models with many WX terms (SLX, SDM, SDEM) may look reasonable under BIC but receive extreme Bayes factors under bridge sampling: the wide priors on the WX coefficients are “wasted” — they spread probability mass over implausible parameter values, reducing the marginal likelihood. This is Bayesian Occam’s razor at work, and bridge sampling is generally more trustworthy because it accounts for the actual prior specification.

Model Comparison: WAIC and LOO

We compare fitted models using information criteria from ArviZ.

# WAIC and LOO comparison
for ic in ("waic", "loo"):
    cmp = az.compare(idata_dict, ic=ic, method="BB-pseudo-BMA")
    az.plot_compare(cmp)
/home/runner/micromamba/envs/test/lib/python3.14/site-packages/arviz/stats/stats.py:1652: UserWarning: For one or more samples the posterior variance of the log predictive densities exceeds 0.4. This could be indication of WAIC starting to fail. 
See http://arxiv.org/abs/1507.04544 for details
  warnings.warn(
/home/runner/micromamba/envs/test/lib/python3.14/site-packages/arviz/stats/stats.py:1652: UserWarning: For one or more samples the posterior variance of the log predictive densities exceeds 0.4. This could be indication of WAIC starting to fail. 
See http://arxiv.org/abs/1507.04544 for details
  warnings.warn(
/home/runner/micromamba/envs/test/lib/python3.14/site-packages/arviz/stats/stats.py:1652: UserWarning: For one or more samples the posterior variance of the log predictive densities exceeds 0.4. This could be indication of WAIC starting to fail. 
See http://arxiv.org/abs/1507.04544 for details
  warnings.warn(
/home/runner/micromamba/envs/test/lib/python3.14/site-packages/arviz/stats/stats.py:1652: UserWarning: For one or more samples the posterior variance of the log predictive densities exceeds 0.4. This could be indication of WAIC starting to fail. 
See http://arxiv.org/abs/1507.04544 for details
  warnings.warn(
/home/runner/micromamba/envs/test/lib/python3.14/site-packages/arviz/stats/stats.py:1652: UserWarning: For one or more samples the posterior variance of the log predictive densities exceeds 0.4. This could be indication of WAIC starting to fail. 
See http://arxiv.org/abs/1507.04544 for details
  warnings.warn(
/home/runner/micromamba/envs/test/lib/python3.14/site-packages/arviz/stats/stats.py:1652: UserWarning: For one or more samples the posterior variance of the log predictive densities exceeds 0.4. This could be indication of WAIC starting to fail. 
See http://arxiv.org/abs/1507.04544 for details
  warnings.warn(
/home/runner/micromamba/envs/test/lib/python3.14/site-packages/arviz/stats/stats.py:782: UserWarning: Estimated shape parameter of Pareto distribution is greater than 0.69 for one or more samples. You should consider using a more robust model, this is because importance sampling is less likely to work well if the marginal posterior and LOO posterior are very different. This is more likely to happen with a non-robust model and highly influential observations.
  warnings.warn(
/home/runner/micromamba/envs/test/lib/python3.14/site-packages/arviz/stats/stats.py:782: UserWarning: Estimated shape parameter of Pareto distribution is greater than 0.69 for one or more samples. You should consider using a more robust model, this is because importance sampling is less likely to work well if the marginal posterior and LOO posterior are very different. This is more likely to happen with a non-robust model and highly influential observations.
  warnings.warn(
../_images/416de188b21d5c7ab823ba3c73811c9497f2a2a7bc4595843dac843c5f10e204.png ../_images/f8cd396a313a17393f4f81fc5cd249c8fa1de49cc1c7564a8b4d33bbb842a8cf.png

Bayesian LM Specification Tests

The bayespecon.diagnostics module provides Bayesian Lagrange-multiplier tests that operate on posterior draws rather than point estimates. These replace the frequentist LM tests that were previously available.

Below we run the Bayesian LM tests from the fitted OLS model:

  • bayesian_lm_lag_test — tests \(H_0: \rho = 0\) (no spatial lag)

  • bayesian_lm_error_test — tests \(H_0: \lambda = 0\) (no spatial error)

  • bayesian_lm_wx_test — tests \(H_0: \gamma = 0\) (no WX, from SAR null)

  • bayesian_lm_sdm_joint_test — joint test for SDM (\(H_0: \rho = 0\) and \(\gamma = 0\))

  • bayesian_lm_slx_error_joint_test — joint test for SDEM (\(H_0: \lambda = 0\) and \(\gamma = 0\))

  • bayesian_robust_lm_lag_test / bayesian_robust_lm_error_test — Anselin–Florax robust pair (SAR vs SEM)

  • bayesian_robust_lm_lag_sdm_test / bayesian_robust_lm_wx_test / bayesian_robust_lm_error_sdem_test — Neyman-orthogonal robust tests

  • bayesian_lm_error_sdm_test / bayesian_lm_lag_sdem_test — SDM/SDEM-aware tests using correctly filtered residuals

In practice, prefer the method API: model.spatial_diagnostics() runs the tests wired to that model class, and model.spatial_diagnostics_decision() returns a recommended specification.

Each returns a BayesianLMTestResult with posterior summary statistics and a Bayesian p-value.

# `spatial_diagnostics()` runs all tests wired to a model class
# and returns a tidy DataFrame.
ols.spatial_diagnostics()
statistic median df p_value ci_lower ci_upper
test
LM-Lag 22.079712 10.588424 1 0.000003 0.015565 103.588908
LM-Error 4.040401 3.188498 1 0.044423 0.717507 12.567629
LM-SDM-Joint 3649.701192 1572.442319 5 0.000000 25.179629 18701.613847
LM-SLX-Error-Joint 3651.437896 1575.318771 5 0.000000 25.791326 18706.073180
Robust-LM-Lag 12.028354 11.394298 1 0.000524 5.623500 21.461866
Robust-LM-Error 5.906351 5.595007 1 0.015086 2.761339 10.538541

Compare Spatial Parameters Across Models

This cell compares posterior means/intervals for the spatial scalar where present:

  • rho in SAR/SDM

  • lam in SEM/SDEM

# Compare spatial parameters across models
spatial_rows = []
for name, model, var in [
    ("SAR", sar, "rho"),
    ("SEM", sem, "lam"),
    ("SDM", sdm, "rho"),
    ("SDEM", sdem, "lam"),
]:
    if var in model.inference_data.posterior:
        summary = az.summary(model.inference_data, var_names=[var], round_to=3)
        summary.insert(0, "model", name)
        spatial_rows.append(summary)

pd.concat(spatial_rows)
model mean sd hdi_3% hdi_97% mcse_mean mcse_sd ess_bulk ess_tail r_hat
rho SAR 0.443 0.139 0.185 0.693 0.004 0.003 991.221 891.549 1.000
lam SEM 0.505 0.179 0.118 0.790 0.008 0.005 556.334 475.293 1.003
rho SDM 0.230 0.168 -0.096 0.529 0.005 0.004 1069.556 1056.787 1.001
lam SDEM 0.324 0.191 -0.045 0.663 0.006 0.004 987.384 1073.611 1.005

Matrix-Mode API Example (Optional)

The package also supports direct matrix inputs. This is useful if your design matrix is already engineered.

y = gdf[ycol]
X = gdf[xcols]

sar_matrix = SAR(y=y, X=X, W=W)
sar_matrix.fit(draws=200, tune=200, chains=2, random_seed=7, progressbar=False)
display(sar_matrix.summary(round_to=3))
Initializing NUTS using jitter+adapt_diag...
Multiprocess sampling (2 chains in 2 jobs)
NUTS: [rho, beta, sigma]
/home/runner/micromamba/envs/test/lib/python3.14/site-packages/pymc/step_methods/hmc/quadpotential.py:316: RuntimeWarning: overflow encountered in dot
  return 0.5 * np.dot(x, v_out)
/home/runner/micromamba/envs/test/lib/python3.14/site-packages/pymc/step_methods/hmc/quadpotential.py:316: RuntimeWarning: overflow encountered in dot
  return 0.5 * np.dot(x, v_out)
Sampling 2 chains for 200 tune and 200 draw iterations (400 + 400 draws total) took 5 seconds.
We recommend running at least 4 chains for robust computation of convergence diagnostics
The rhat statistic is larger than 1.01 for some parameters. This indicates problems during sampling. See https://arxiv.org/abs/1903.08008 for details
mean sd hdi_3% hdi_97% mcse_mean mcse_sd ess_bulk ess_tail r_hat
poverty 0.051 0.295 -0.426 0.640 0.015 0.016 366.145 219.495 1.014
rev_rating 0.453 0.106 0.258 0.649 0.006 0.004 314.644 315.514 1.011
num_spots 0.080 0.023 0.038 0.120 0.001 0.001 350.772 254.789 1.005
crowded -1.719 0.812 -3.283 -0.198 0.045 0.036 322.926 278.203 1.003
rho 0.432 0.120 0.220 0.661 0.006 0.005 367.146 287.289 1.008
sigma 24.275 2.208 20.730 28.992 0.117 0.155 363.195 219.473 1.013