Nonlinear Spatial Models

This guide demonstrates three nonlinear spatial models in bayespecon on synthetic data.

Spatial Probit

For binary outcomes \(y_i \in \{0,1\}\), latent utilities \(y^*\) follow a spatial lag:

\[y^* = \rho W y^* + X\beta + a + \varepsilon, \quad \varepsilon \sim \mathcal{N}(0, I)\]

where \(a\) collects region-level random effects, \(a_r \sim \mathcal{N}(0, \sigma_a^2)\), and the observed outcome is \(y_i = \mathbf{1}[y^*_i > 0]\).
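Because the lag enters the latent scale, the reduced form is \(y^* = (I - \rho W)^{-1}(X\beta + a + \varepsilon)\). The following is a minimal numpy sketch of that data-generating process, not the library's `dgp.simulate_spatial_probit`; the 5-node line graph, one-unit-per-region setup, and parameter values are purely illustrative:

```python
import numpy as np

rng = np.random.default_rng(0)
n = 5

# Row-standardized line-graph weights: node i neighbors i-1 and i+1
A = np.diag(np.ones(n - 1), 1) + np.diag(np.ones(n - 1), -1)
W = A / A.sum(axis=1, keepdims=True)

rho, beta, sigma_a = 0.35, np.array([0.3, 1.0]), 0.8
X = rng.normal(size=(n, 2))
a = rng.normal(scale=sigma_a, size=n)  # region effects (one unit per region here)
eps = rng.normal(size=n)               # unit-variance probit errors

# Reduced form: y* = (I - rho W)^{-1} (X beta + a + eps)
y_star = np.linalg.solve(np.eye(n) - rho * W, X @ beta + a + eps)
y = (y_star > 0).astype(int)
```

Solving the linear system is preferable to forming the inverse explicitly; for \(|\rho| < 1\) and row-standardized \(W\), \(I - \rho W\) is guaranteed invertible.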

SAR Tobit

For censored outcomes, the spatial autoregressive Tobit specifies:

\[y^* = \rho W y^* + X\beta + \varepsilon, \quad \varepsilon \sim \mathcal{N}(0, \sigma^2 I)\]
\[y_i = \max(c,\, y^*_i)\]

where \(c\) is the censoring threshold (default 0).
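The observation rule simply censors the SAR reduced-form latent outcome from below at \(c\). A numpy sketch under the same illustrative line-graph weights (again, not the library's own simulator):

```python
import numpy as np

rng = np.random.default_rng(0)
n = 5

# Row-standardized line-graph weights
A = np.diag(np.ones(n - 1), 1) + np.diag(np.ones(n - 1), -1)
W = A / A.sum(axis=1, keepdims=True)

rho, beta, sigma, c = 0.4, np.array([1.0, 1.5]), 0.8, 0.0
X = rng.normal(size=(n, 2))
eps = rng.normal(scale=sigma, size=n)

# Latent outcome follows the SAR reduced form, then is censored at c
y_star = np.linalg.solve(np.eye(n) - rho * W, X @ beta + eps)
y = np.maximum(c, y_star)
```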

SEM Tobit

The spatial error Tobit places spatial dependence in the disturbance:

\[y^* = X\beta + u, \quad u = \lambda W u + \varepsilon, \quad \varepsilon \sim \mathcal{N}(0, \sigma^2 I)\]
\[y_i = \max(c,\, y^*_i)\]
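In the SEM variant only the disturbance is spatially filtered, so \(X\beta\) receives no spatial multiplier: \(y^* = X\beta + (I - \lambda W)^{-1}\varepsilon\). A sketch with the same illustrative weights, highlighting that contrast:

```python
import numpy as np

rng = np.random.default_rng(0)
n = 5

# Row-standardized line-graph weights
A = np.diag(np.ones(n - 1), 1) + np.diag(np.ones(n - 1), -1)
W = A / A.sum(axis=1, keepdims=True)

lam, beta, sigma, c = 0.35, np.array([1.0, 1.5]), 0.8, 0.0
X = rng.normal(size=(n, 2))
eps = rng.normal(scale=sigma, size=n)

# Only the error is spatially transformed: u = (I - lam W)^{-1} eps
u = np.linalg.solve(np.eye(n) - lam * W, eps)
y_star = X @ beta + u
y = np.maximum(c, y_star)
```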
import numpy as np
The rest of the guide simulates data from each model with the dgp helpers and fits the corresponding estimator.
from libpysal.graph import Graph

from bayespecon import SARTobit, SEMTobit, SpatialProbit, dgp

rng = np.random.default_rng(1234)
# SpatialProbit
m, n_per_region = 8, 20

# Create line-graph weights: each node connected to neighbors
focal, neighbor = [], []
for i in range(m):
    if i > 0:
        focal.append(i)
        neighbor.append(i - 1)
    if i < m - 1:
        focal.append(i)
        neighbor.append(i + 1)
Wm = Graph.from_arrays(
    np.array(focal), np.array(neighbor), weight=np.ones(len(focal))
).transform("r")
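As a sanity check on the weights, the `"r"` transform row-standardizes the binary adjacency so each row sums to one. A dense numpy equivalent of the 8-node line graph above, constructed without libpysal:

```python
import numpy as np

m = 8
# Binary line-graph adjacency: node i neighbors i-1 and i+1
A = np.diag(np.ones(m - 1), 1) + np.diag(np.ones(m - 1), -1)

# "r" transform: divide each row by its row sum
Wd = A / A.sum(axis=1, keepdims=True)

print(Wd[0])  # endpoint row: its single neighbor gets weight 1.0
print(Wd[3])  # interior row: each of its two neighbors gets weight 0.5
```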

rho, beta, sigma_a = 0.35, np.array([0.3, 1.0]), 0.8

sp_data = dgp.simulate_spatial_probit(
    W=Wm,
    rho=rho,
    beta=beta,
    sigma_a=sigma_a,
    n_per_region=n_per_region,
    rng=rng,
)

sp = SpatialProbit(
    y=sp_data["y"],
    X=sp_data["X"],
    W=sp_data["W_graph"],
    region_ids=sp_data["region_ids"],
)
sp.fit(draws=300, tune=300, chains=2, random_seed=42, progressbar=False)
sp.summary(var_names=["rho", "beta", "sigma_a"])
Initializing NUTS using jitter+adapt_diag...
Multiprocess sampling (2 chains in 2 jobs)
NUTS: [rho, beta, sigma_a, a_raw]
Sampling 2 chains for 300 tune and 300 draw iterations (600 + 600 draws total) took 8 seconds.
There were 23 divergences after tuning. Increase `target_accept` or reparameterize.
We recommend running at least 4 chains for robust computation of convergence diagnostics
The rhat statistic is larger than 1.01 for some parameters. This indicates problems during sampling. See https://arxiv.org/abs/1903.08008 for details
The effective sample size per chain is smaller than 100 for some parameters.  A higher number is needed for reliable rhat and ess computation. See https://arxiv.org/abs/1903.08008 for details
          mean     sd  hdi_3%  hdi_97%  mcse_mean  mcse_sd  ess_bulk  ess_tail  r_hat
rho      0.204  0.367  -0.511    0.766      0.029    0.016     163.0     198.0   1.01
x0       0.196  1.441  -3.345    2.313      0.240    0.325      53.0      24.0   1.09
x1       1.010  0.187   0.670    1.369      0.012    0.007     235.0     393.0   1.00
sigma_a  2.000  0.795   0.812    3.681      0.175    0.131      32.0      30.0   1.07
# SARTobit and SEMTobit
n = 30

# Create line-graph weights: each node connected to neighbors
focal, neighbor = [], []
for i in range(n):
    if i > 0:
        focal.append(i)
        neighbor.append(i - 1)
    if i < n - 1:
        focal.append(i)
        neighbor.append(i + 1)
W = Graph.from_arrays(
    np.array(focal), np.array(neighbor), weight=np.ones(len(focal))
).transform("r")

beta = np.array([1.0, 1.5])
sigma, rho, lam = 0.8, 0.4, 0.35

sar_data = dgp.simulate_sar_tobit(
    W=W,
    rho=rho,
    beta=beta,
    sigma=sigma,
    censoring=0.0,
    rng=rng,
)
sar_tobit = SARTobit(
    y=sar_data["y"],
    X=sar_data["X"],
    W=sar_data["W_graph"],
    censoring=0.0,
)
sar_tobit.fit(draws=300, tune=300, chains=2, random_seed=42, progressbar=False)

sem_data = dgp.simulate_sem_tobit(
    W=W,
    lam=lam,
    beta=beta,
    sigma=sigma,
    censoring=0.0,
    rng=rng,
)
sem_tobit = SEMTobit(
    y=sem_data["y"],
    X=sem_data["X"],
    W=sem_data["W_graph"],
    censoring=0.0,
)
sem_tobit.fit(draws=300, tune=300, chains=2, random_seed=42, progressbar=False)

(
    sar_tobit.summary(var_names=["rho", "beta", "sigma"]),
    sem_tobit.summary(var_names=["lam", "beta", "sigma"]),
)
Initializing NUTS using jitter+adapt_diag...
Multiprocess sampling (2 chains in 2 jobs)
NUTS: [rho, beta, sigma, y_cens_gap]
Sampling 2 chains for 300 tune and 300 draw iterations (600 + 600 draws total) took 6 seconds.
We recommend running at least 4 chains for robust computation of convergence diagnostics
The rhat statistic is larger than 1.01 for some parameters. This indicates problems during sampling. See https://arxiv.org/abs/1903.08008 for details
The effective sample size per chain is smaller than 100 for some parameters.  A higher number is needed for reliable rhat and ess computation. See https://arxiv.org/abs/1903.08008 for details
Initializing NUTS using jitter+adapt_diag...
Multiprocess sampling (2 chains in 2 jobs)
NUTS: [lam, beta, sigma, y_cens_gap]
Sampling 2 chains for 300 tune and 300 draw iterations (600 + 600 draws total) took 6 seconds.
We recommend running at least 4 chains for robust computation of convergence diagnostics
(        mean     sd  hdi_3%  hdi_97%  mcse_mean  mcse_sd  ess_bulk  ess_tail  \
 rho    0.465  0.082   0.317    0.614      0.004    0.003     341.0     436.0   
 x0     0.750  0.222   0.348    1.156      0.012    0.008     353.0     313.0   
 x1     1.641  0.173   1.343    1.992      0.010    0.007     309.0     303.0   
 sigma  0.821  0.143   0.567    1.055      0.008    0.008     403.0     466.0   
 
        r_hat  
 rho     1.01  
 x0      1.01  
 x1      1.00  
 sigma   1.00  ,
         mean     sd  hdi_3%  hdi_97%  mcse_mean  mcse_sd  ess_bulk  ess_tail  \
 lam    0.439  0.144   0.170    0.702      0.007    0.006     462.0     422.0   
 x0     1.443  0.205   1.048    1.812      0.010    0.009     409.0     366.0   
 x1     1.212  0.091   1.055    1.411      0.005    0.004     375.0     337.0   
 sigma  0.552  0.087   0.419    0.733      0.004    0.003     454.0     442.0   
 
        r_hat  
 lam     1.01  
 x0      1.01  
 x1      1.01  
 sigma   1.00  )