Nonlinear Spatial Models
This guide demonstrates three of bayespecon's nonlinear spatial models on synthetic data: a spatial probit, a SAR Tobit, and a SEM Tobit.
Spatial Probit
For binary outcomes \(y_i \in \{0,1\}\), latent utilities \(y^*\) follow a spatial lag:
\[y^* = \rho W y^* + X\beta + a + \varepsilon, \quad \varepsilon \sim \mathcal{N}(0, I)\]
where \(a\) collects region-level random effects with scale \(\sigma_a\), and the observed outcome is \(y_i = \mathbf{1}[y^*_i > 0]\).
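Solving the lag equation for the latent utilities makes the induced dependence explicit. Conditional on \(a\),
\[y^* = (I - \rho W)^{-1}(X\beta + a + \varepsilon) \sim \mathcal{N}\left((I - \rho W)^{-1}(X\beta + a),\ \left[(I - \rho W)^\top (I - \rho W)\right]^{-1}\right)\]
so utilities are correlated across connected units even though \(\varepsilon\) is i.i.d., and the usual independent-probit likelihood does not factorize.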
SAR Tobit
For censored outcomes, the spatial autoregressive Tobit specifies:
\[y^* = \rho W y^* + X\beta + \varepsilon, \quad \varepsilon \sim \mathcal{N}(0, \sigma^2 I)\]
\[y_i = \max(c,\, y^*_i)\]
where \(c\) is the censoring threshold (default 0).
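Solving for \(y^*\) again gives a reduced form in which \(\rho\) enters both the mean and the covariance of the latent outcome:
\[y^* = (I - \rho W)^{-1}(X\beta + \varepsilon) \sim \mathcal{N}\left((I - \rho W)^{-1} X\beta,\ \sigma^2 \left[(I - \rho W)^\top (I - \rho W)\right]^{-1}\right)\]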
SEM Tobit
The spatial error Tobit places spatial dependence in the disturbance:
\[y^* = X\beta + u, \quad u = \lambda W u + \varepsilon, \quad \varepsilon \sim \mathcal{N}(0, \sigma^2 I)\]
\[y_i = \max(c,\, y^*_i)\]
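Because only the disturbance is spatially transformed, \(u = (I - \lambda W)^{-1}\varepsilon\), the reduced-form mean is untouched by \(\lambda\):
\[y^* \sim \mathcal{N}\left(X\beta,\ \sigma^2 \left[(I - \lambda W)^\top (I - \lambda W)\right]^{-1}\right)\]
This is the key contrast with the SAR Tobit, where \(\rho\) also shifts \(\mathbb{E}[y^*]\). The cell below simulates data from each model on a small row-standardized line graph and fits short NUTS chains.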
import numpy as np
from libpysal.graph import Graph
from bayespecon import SARTobit, SEMTobit, SpatialProbit, dgp
rng = np.random.default_rng(1234)
# SpatialProbit
m, n_per_region = 8, 20
# Create line-graph weights: each node connected to neighbors
focal, neighbor = [], []
for i in range(m):
    if i > 0:
        focal.append(i)
        neighbor.append(i - 1)
    if i < m - 1:
        focal.append(i)
        neighbor.append(i + 1)
Wm = Graph.from_arrays(
    np.array(focal), np.array(neighbor), weight=np.ones(len(focal))
).transform("r")
rho, beta, sigma_a = 0.35, np.array([0.3, 1.0]), 0.8
sp_data = dgp.simulate_spatial_probit(
    W=Wm,
    rho=rho,
    beta=beta,
    sigma_a=sigma_a,
    n_per_region=n_per_region,
    rng=rng,
)
sp = SpatialProbit(
    y=sp_data["y"],
    X=sp_data["X"],
    W=sp_data["W_graph"],
    region_ids=sp_data["region_ids"],
)
sp.fit(draws=300, tune=300, chains=2, random_seed=42, progressbar=False)
sp.summary(var_names=["rho", "beta", "sigma_a"])
Initializing NUTS using jitter+adapt_diag...
Multiprocess sampling (2 chains in 2 jobs)
NUTS: [rho, beta, sigma_a, a_raw]
Sampling 2 chains for 300 tune and 300 draw iterations (600 + 600 draws total) took 8 seconds.
There were 23 divergences after tuning. Increase `target_accept` or reparameterize.
We recommend running at least 4 chains for robust computation of convergence diagnostics
The rhat statistic is larger than 1.01 for some parameters. This indicates problems during sampling. See https://arxiv.org/abs/1903.08008 for details
The effective sample size per chain is smaller than 100 for some parameters. A higher number is needed for reliable rhat and ess computation. See https://arxiv.org/abs/1903.08008 for details
| | mean | sd | hdi_3% | hdi_97% | mcse_mean | mcse_sd | ess_bulk | ess_tail | r_hat |
|---|---|---|---|---|---|---|---|---|---|
| rho | 0.204 | 0.367 | -0.511 | 0.766 | 0.029 | 0.016 | 163.0 | 198.0 | 1.01 |
| x0 | 0.196 | 1.441 | -3.345 | 2.313 | 0.240 | 0.325 | 53.0 | 24.0 | 1.09 |
| x1 | 1.010 | 0.187 | 0.670 | 1.369 | 0.012 | 0.007 | 235.0 | 393.0 | 1.00 |
| sigma_a | 2.000 | 0.795 | 0.812 | 3.681 | 0.175 | 0.131 | 32.0 | 30.0 | 1.07 |
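The divergences, the r_hat flags, and the small effective sample sizes reflect the deliberately short demonstration chains (300 tune / 300 draws, 2 chains); real analyses should use longer runs and at least four chains, as the sampler warnings suggest. It can also be worth sanity-checking the weights before fitting. A minimal sketch, assuming libpysal's `Graph.sparse` accessor and reusing `Wm` from the cell above:
# Dense view of the row-standardized line graph: interior nodes put
# weight 0.5 on each of their two neighbors, endpoints put weight 1.0
# on their single neighbor, and every row sums to one.
dense_Wm = Wm.sparse.toarray()
print(dense_Wm.round(2))
print("row sums:", dense_Wm.sum(axis=1))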
# SARTobit and SEMTobit
n = 30
# Create line-graph weights: each node connected to neighbors
focal, neighbor = [], []
for i in range(n):
    if i > 0:
        focal.append(i)
        neighbor.append(i - 1)
    if i < n - 1:
        focal.append(i)
        neighbor.append(i + 1)
W = Graph.from_arrays(
    np.array(focal), np.array(neighbor), weight=np.ones(len(focal))
).transform("r")
beta = np.array([1.0, 1.5])
sigma, rho, lam = 0.8, 0.4, 0.35
sar_data = dgp.simulate_sar_tobit(
    W=W,
    rho=rho,
    beta=beta,
    sigma=sigma,
    censoring=0.0,
    rng=rng,
)
sar_tobit = SARTobit(
    y=sar_data["y"],
    X=sar_data["X"],
    W=sar_data["W_graph"],
    censoring=0.0,
)
sar_tobit.fit(draws=300, tune=300, chains=2, random_seed=42, progressbar=False)
sem_data = dgp.simulate_sem_tobit(
    W=W,
    lam=lam,
    beta=beta,
    sigma=sigma,
    censoring=0.0,
    rng=rng,
)
sem_tobit = SEMTobit(
    y=sem_data["y"],
    X=sem_data["X"],
    W=sem_data["W_graph"],
    censoring=0.0,
)
sem_tobit.fit(draws=300, tune=300, chains=2, random_seed=42, progressbar=False)
(
    sar_tobit.summary(var_names=["rho", "beta", "sigma"]),
    sem_tobit.summary(var_names=["lam", "beta", "sigma"]),
)
Initializing NUTS using jitter+adapt_diag...
Multiprocess sampling (2 chains in 2 jobs)
NUTS: [rho, beta, sigma, y_cens_gap]
Sampling 2 chains for 300 tune and 300 draw iterations (600 + 600 draws total) took 6 seconds.
We recommend running at least 4 chains for robust computation of convergence diagnostics
The rhat statistic is larger than 1.01 for some parameters. This indicates problems during sampling. See https://arxiv.org/abs/1903.08008 for details
The effective sample size per chain is smaller than 100 for some parameters. A higher number is needed for reliable rhat and ess computation. See https://arxiv.org/abs/1903.08008 for details
Initializing NUTS using jitter+adapt_diag...
Multiprocess sampling (2 chains in 2 jobs)
NUTS: [lam, beta, sigma, y_cens_gap]
Sampling 2 chains for 300 tune and 300 draw iterations (600 + 600 draws total) took 6 seconds.
We recommend running at least 4 chains for robust computation of convergence diagnostics
SARTobit summary:

| | mean | sd | hdi_3% | hdi_97% | mcse_mean | mcse_sd | ess_bulk | ess_tail | r_hat |
|---|---|---|---|---|---|---|---|---|---|
| rho | 0.465 | 0.082 | 0.317 | 0.614 | 0.004 | 0.003 | 341.0 | 436.0 | 1.01 |
| x0 | 0.750 | 0.222 | 0.348 | 1.156 | 0.012 | 0.008 | 353.0 | 313.0 | 1.01 |
| x1 | 1.641 | 0.173 | 1.343 | 1.992 | 0.010 | 0.007 | 309.0 | 303.0 | 1.00 |
| sigma | 0.821 | 0.143 | 0.567 | 1.055 | 0.008 | 0.008 | 403.0 | 466.0 | 1.00 |

SEMTobit summary:

| | mean | sd | hdi_3% | hdi_97% | mcse_mean | mcse_sd | ess_bulk | ess_tail | r_hat |
|---|---|---|---|---|---|---|---|---|---|
| lam | 0.439 | 0.144 | 0.170 | 0.702 | 0.007 | 0.006 | 462.0 | 422.0 | 1.01 |
| x0 | 1.443 | 0.205 | 1.048 | 1.812 | 0.010 | 0.009 | 409.0 | 366.0 | 1.01 |
| x1 | 1.212 | 0.091 | 1.055 | 1.411 | 0.005 | 0.004 | 375.0 | 337.0 | 1.01 |
| sigma | 0.552 | 0.087 | 0.419 | 0.733 | 0.004 | 0.003 | 454.0 | 442.0 | 1.00 |
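The SAR posterior recovers the DGP values (\(\rho = 0.4\), \(\beta = (1.0, 1.5)\), \(\sigma = 0.8\)) within its HDIs; the SEM run lands farther off, which is unsurprising for a single simulated draw with \(n = 30\) and short chains. One further check that is often useful with Tobit models is how binding the censoring actually is; a minimal sketch reusing `sar_data` and `sem_data` from above:
# Censored observations are recorded exactly at the threshold c = 0.0,
# so the censored share is the fraction of y values equal to zero.
print("SAR Tobit censored share:", float((sar_data["y"] == 0.0).mean()))
print("SEM Tobit censored share:", float((sem_data["y"] == 0.0).mean()))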