Bayesian Spatial Flow (Origin–Destination) Models¶
This notebook demonstrates the cross-sectional flow models in bayespecon:
| Model | Description |
|---|---|
| `SARFlow` | Three free ρ parameters — destination, origin, network |
| `SARFlowSeparable` | Constrained ρ_w = −ρ_d·ρ_o; exact eigenvalue log-det |
Both are fully Bayesian implementations of the LeSage–Fischer (2008) SAR flow framework.
Background: The SAR Flow Model¶
Let \(y\) be the \(N\)-vector of observed flows (\(N = n^2\) O-D pairs). The model is

\[
y = \rho_d W_d\, y + \rho_o W_o\, y + \rho_w W_w\, y + X\beta + \varepsilon,
\qquad \varepsilon \sim \mathcal{N}(0, \sigma^2 I_N),
\]

where

\[
W_d = I_n \otimes W, \qquad W_o = W \otimes I_n, \qquad W_w = W \otimes W
\]

capture destination, origin, and network (O-D pair) neighbourhood dependence respectively.
The reduced form is

\[
y = A^{-1}(X\beta + \varepsilon), \qquad A = I_N - \rho_d W_d - \rho_o W_o - \rho_w W_w.
\]

The design matrix \(X\) follows the LeSage layout with separate destination, origin, and intra-zonal coefficient blocks:

\[
X = \big[\,\mathbf{1}_N,\;\; \operatorname{vec}(I_n),\;\; \iota_n \otimes X_d,\;\; X_o \otimes \iota_n,\;\; \operatorname{vec}(I_n) \odot (\iota_n \otimes X_d),\;\; \log(1 + d)\,\big].
\]
Separable variant¶
SARFlowSeparable imposes \(\rho_w = -\rho_d \rho_o\), which makes the filter matrix factorisable as a Kronecker product:

\[
A = (I_n - \rho_o W) \otimes (I_n - \rho_d W).
\]

This enables exact, \(O(n)\) per-draw log-det evaluation via the eigenvalues \(\lambda_i\) of the small \(n \times n\) matrix \(W\):

\[
\log|A| = n \sum_{i=1}^{n} \big[\log(1 - \rho_d \lambda_i) + \log(1 - \rho_o \lambda_i)\big].
\]
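The Kronecker factorisation is easy to verify numerically on a tiny graph. The sketch below (a random row-standardised \(W\); all sizes and ρ values are arbitrary) builds the full \(N \times N\) filter with \(\rho_w = -\rho_d\rho_o\) baked in and checks the eigenvalue formula against a dense `slogdet`:

```python
import numpy as np

rng_sk = np.random.default_rng(0)
n_s, rho_d, rho_o = 5, 0.4, 0.3

# Random row-standardised weight matrix with zero diagonal
W = rng_sk.random((n_s, n_s))
np.fill_diagonal(W, 0.0)
W /= W.sum(axis=1, keepdims=True)

In = np.eye(n_s)
# Full filter with the separable constraint rho_w = -rho_d * rho_o baked in
A = (np.eye(n_s * n_s)
     - rho_d * np.kron(In, W)             # W_d = I_n (x) W
     - rho_o * np.kron(W, In)             # W_o = W (x) I_n
     + rho_d * rho_o * np.kron(W, W))     # rho_w * W_w with rho_w = -rho_d*rho_o

# Exact log-det from the n eigenvalues of W alone (possibly complex;
# conjugate pairs cancel, so the real part is the answer)
lam = np.linalg.eigvals(W)
logdet_eig = n_s * np.real(np.log(1 - rho_d * lam).sum()
                           + np.log(1 - rho_o * lam).sum())

sign, logdet_dense = np.linalg.slogdet(A)
assert sign > 0 and np.isclose(logdet_eig, logdet_dense)
```

The dense check costs \(O(N^3)\); the eigenvalue route needs one \(n \times n\) eigendecomposition up front and then \(O(n)\) work per posterior draw.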
Setup¶
import warnings
import arviz as az
import matplotlib.pyplot as plt
import numpy as np
import pandas as pd
from bayespecon import SARFlow, SARFlowSeparable, flow_design_matrix
from bayespecon.dgp.flows import generate_flow_data, generate_flow_data_separable
warnings.filterwarnings("ignore")
az.style.use("arviz-white")
rng = np.random.default_rng(42)
1 Simulated Data¶
We simulate flow data on \(n = 12\) spatial units, giving \(N = 144\) O-D pairs.
When neither G nor gdf is supplied, generate_flow_data
synthesises a point grid, builds a row-standardised KNN graph
(knn_k=4 by default) and computes the pairwise distance matrix used
for the gravity-style log_distance regressor.
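For intuition, the graph-building step can be mimicked in a few lines of NumPy. This is an illustrative sketch, not the library's internal implementation; the point locations are arbitrary:

```python
import numpy as np

rng_sk = np.random.default_rng(0)
n_pts, k = 12, 4
pts = rng_sk.random((n_pts, 2))                  # synthetic point locations

# Pairwise Euclidean distances, self-distances excluded
dist = np.linalg.norm(pts[:, None, :] - pts[None, :, :], axis=-1)
np.fill_diagonal(dist, np.inf)

# k nearest neighbours per row -> binary adjacency -> row-standardise
nbrs = np.argsort(dist, axis=1)[:, :k]
G_knn = np.zeros((n_pts, n_pts))
np.put_along_axis(G_knn, nbrs, 1.0, axis=1)
G_knn /= G_knn.sum(axis=1, keepdims=True)

assert np.allclose(G_knn.sum(axis=1), 1.0)       # each row sums to one
```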
True parameters

| Parameter | Value |
|---|---|
| ρ_d | 0.35 |
| ρ_o | 0.25 |
| ρ_w | 0.10 |
| β_d | [1.0, −0.5] |
| β_o | [0.5, 0.3] |
| σ | 1.0 |
| γ_dist (log_distance coef.) | −0.5 (default) |
# True parameters (matching the table above)
RHO_D = 0.35
RHO_O = 0.25
RHO_W = 0.10
BETA_D = [1.0, -0.5]
BETA_O = [0.5, 0.3]
SIGMA = 1.0
# Generate synthetic flow data (auto-builds KNN graph + distance matrix)
data = generate_flow_data(
n=12,
rho_d=RHO_D,
rho_o=RHO_O,
rho_w=RHO_W,
beta_d=BETA_D,
beta_o=BETA_O,
sigma=SIGMA,
seed=42,
)
n = 12
N = n * n
G = data["G"]
gdf = data["gdf"]
print(f"Generated {N} O-D flows on {n} spatial units")
print(f"Design matrix: {data['X'].shape}, columns: {data['col_names']}")
print(f"Design: k_d={data['design'].k_d}, k_o={data['design'].k_o}")
Generated 144 O-D flows on 12 spatial units
Design matrix: (144, 9), columns: ['intercept', 'intra_indicator', 'dest_x0', 'dest_x1', 'orig_x0', 'orig_x1', 'intra_x0', 'intra_x1', 'log_distance']
Design: k_d=2, k_o=2
y_obs = data["y_vec"] # observed (positive) flows
y_vec = np.log(y_obs) # latent SAR scale (DGP default is lognormal)
X = data["X"] # (N, p) design matrix (incl. log_distance column)
col_names = data["col_names"]
G = data["G"] # row-standardised KNN graph synthesised by the DGP
gdf = data["gdf"] # synthetic point GeoDataFrame
print(f"Flow observations: N = {N} ({n}×{n})")
print(f"Design matrix shape: {X.shape}")
print(f"Column names: {col_names}")
print(f"Last column ({col_names[-1]}) coefficient (gamma_dist) = {data['gamma_dist']}")
print(
f"\ny (observed) summary: min={y_obs.min():.2f} mean={y_obs.mean():.2f} max={y_obs.max():.2f}"
)
print(
f"y (log scale) summary: min={y_vec.min():.2f} mean={y_vec.mean():.2f} max={y_vec.max():.2f}"
)
Flow observations: N = 144 (12×12)
Design matrix shape: (144, 9)
Column names: ['intercept', 'intra_indicator', 'dest_x0', 'dest_x1', 'orig_x0', 'orig_x1', 'intra_x0', 'intra_x1', 'log_distance']
Last column (log_distance) coefficient (gamma_dist) = -0.5
y (observed) summary: min=0.02 mean=1.65 max=24.26
y (log scale) summary: min=-4.08 mean=-0.54 max=3.19
Note — lognormal default (since v0.2).
generate_flow_data now returns strictly-positive flows by default, drawn from \(y = \exp(\eta)\), where \(\eta = A^{-1}(X\beta + \varepsilon)\) is the latent SAR-filtered linear predictor (also exposed as data["eta_vec"]). The Gaussian-likelihood flow models in this notebook (OLSFlow, SARFlow, SARFlowSeparable) operate on the latent scale, so we fit on np.log(data["y_vec"]). Pass distribution="normal" to recover the legacy Gaussian-on-y behaviour.
# Visualise the flow matrix (observed scale) and the latent log-scale histogram
fig, axes = plt.subplots(1, 2, figsize=(10, 4))
im = axes[0].imshow(y_obs.reshape(n, n), cmap="YlOrRd")
axes[0].set_title("Simulated flow matrix $Y$ (origin × destination)")
axes[0].set_xlabel("Destination")
axes[0].set_ylabel("Origin")
plt.colorbar(im, ax=axes[0])
axes[1].hist(y_vec, bins=30, edgecolor="white")
axes[1].set_title(r"Distribution of $\log y_{ij}$ (latent SAR scale)")
axes[1].set_xlabel("log flow")
axes[1].set_ylabel("Count")
# plt.tight_layout()
plt.show()
2 SARFlow: Three-parameter model¶
SARFlow places a Dirichlet prior on \((\rho_d, \rho_o, \rho_w)\) that enforces positivity and the stability constraint \(\rho_d + \rho_o + \rho_w \leq 1\) exactly, using the stick-breaking transformation for NUTS efficiency.
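The constraint holds by construction: draw a 4-component Dirichlet and keep the first three coordinates, with the fourth acting as slack. A sketch (the concentration parameters of the actual prior are not shown here):

```python
import numpy as np

rng_sk = np.random.default_rng(0)
# Four coordinates summing to one; the fourth is the slack 1 - sum(rho)
draws = rng_sk.dirichlet(np.ones(4), size=10_000)
rho_draws = draws[:, :3]           # (rho_d, rho_o, rho_w) prior samples

assert (rho_draws >= 0).all()                          # positivity
assert (rho_draws.sum(axis=1) <= 1.0 + 1e-12).all()    # stability: sum <= 1
```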
The log-determinant \(\log|A|\) is evaluated via the Barry–Pace trace-stochastic method, which requires only sparse matrix–vector products.
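The idea behind the traces method can be sketched for a single-ρ filter: expand \(\log|I - \rho W| = -\sum_{k=1}^{m} \rho^k \operatorname{tr}(W^k)/k\) and estimate each trace with Hutchinson (Rademacher) probes, so only matrix–vector products are needed. This is an illustrative re-implementation, not bayespecon's internal code; the dense `W` and the probe count are arbitrary:

```python
import numpy as np

rng_sk = np.random.default_rng(0)
n_bp, rho, miter, nprobe = 30, 0.4, 20, 500

W_bp = rng_sk.random((n_bp, n_bp))
np.fill_diagonal(W_bp, 0.0)
W_bp /= W_bp.sum(axis=1, keepdims=True)          # row-standardised

# Hutchinson estimates of tr(W^k), k = 1..miter, using matvecs only
Z = rng_sk.choice([-1.0, 1.0], size=(n_bp, nprobe))   # Rademacher probes
traces = np.empty(miter)
V = Z.copy()
for k in range(miter):
    V = W_bp @ V                                 # V = W^{k+1} Z
    traces[k] = (Z * V).sum() / nprobe           # mean of z' W^{k+1} z

# Truncated Taylor expansion: log|I - rho W| = -sum_k rho^k tr(W^k) / k
ks = np.arange(1, miter + 1)
logdet_bp = -np.sum(rho ** ks * traces / ks)

_, logdet_exact = np.linalg.slogdet(np.eye(n_bp) - rho * W_bp)
print(logdet_bp, logdet_exact)   # stochastic estimate vs. exact value
```

The estimate is stochastic, which is why `miter` (polynomial order) and the probe count trade accuracy against precomputation time.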
sar_flow = SARFlow(
y_vec,
G,
X,
col_names=col_names,
logdet_method="traces", # Barry-Pace stochastic trace (default)
restrict_positive=True, # Dirichlet stability prior
miter=20, # trace polynomial order (increase for better accuracy)
trace_seed=0,
)
idata_sar = sar_flow.fit(
draws=1000,
tune=1000,
chains=4,
target_accept=0.9,
random_seed=42,
progressbar=True,
)
Initializing NUTS using jitter+adapt_diag...
Multiprocess sampling (4 chains in 2 jobs)
NUTS: [rho_simplex, beta, sigma]
Sampling 4 chains for 1_000 tune and 1_000 draw iterations (4_000 + 4_000 draws total) took 51 seconds.
# Posterior summary for the spatial autoregressive parameters
summary_rho = sar_flow.summary(var_names=["rho_d", "rho_o", "rho_w"])
print("=== SARFlow: spatial autoregressive parameters ===")
print(summary_rho.to_string())
# Compare to true values
print(f"\nTrue values: rho_d={RHO_D} rho_o={RHO_O} rho_w={RHO_W}")
=== SARFlow: spatial autoregressive parameters ===
mean sd hdi_3% hdi_97% mcse_mean mcse_sd ess_bulk ess_tail r_hat
rho_d 0.129 0.086 0.000 0.277 0.002 0.001 2403.0 1642.0 1.0
rho_o 0.121 0.082 0.000 0.265 0.002 0.001 2364.0 1668.0 1.0
rho_w 0.304 0.147 0.007 0.541 0.002 0.002 3308.0 2054.0 1.0
True values: rho_d=0.35 rho_o=0.25 rho_w=0.1
# Full coefficient summary
summary_beta = sar_flow.summary(var_names=["beta", "sigma"])
print("=== SARFlow: regression coefficients ===")
print(summary_beta.to_string())
=== SARFlow: regression coefficients ===
mean sd hdi_3% hdi_97% mcse_mean mcse_sd ess_bulk ess_tail r_hat
beta[intercept] -0.161 0.332 -0.790 0.440 0.006 0.005 2857.0 2845.0 1.0
beta[intra_indicator] 0.097 0.465 -0.677 1.062 0.009 0.007 2790.0 2286.0 1.0
beta[dest_x0] 1.126 0.157 0.831 1.418 0.003 0.003 2815.0 2620.0 1.0
beta[dest_x1] -0.354 0.131 -0.595 -0.105 0.002 0.002 3254.0 3060.0 1.0
beta[orig_x0] 0.811 0.143 0.543 1.083 0.003 0.002 2944.0 2514.0 1.0
beta[orig_x1] 0.368 0.131 0.105 0.595 0.002 0.002 3371.0 2800.0 1.0
beta[intra_x0] -0.397 0.437 -1.240 0.392 0.008 0.007 3205.0 2829.0 1.0
beta[intra_x1] -0.184 0.434 -1.015 0.588 0.007 0.006 3486.0 2989.0 1.0
beta[log_distance] -0.540 0.350 -1.157 0.157 0.007 0.005 2484.0 2471.0 1.0
sigma 0.929 0.058 0.832 1.047 0.001 0.001 4356.0 2704.0 1.0
# Trace plots for the three ρ parameters
az.plot_trace(
idata_sar,
var_names=["rho_d", "rho_o", "rho_w"],
figsize=(10, 6),
)
plt.suptitle("SARFlow — posterior traces", y=1.01)
plt.tight_layout()
plt.show()
# Posterior densities with true-value markers
fig, axes = plt.subplots(1, 3, figsize=(12, 3))
params = [("rho_d", RHO_D), ("rho_o", RHO_O), ("rho_w", RHO_W)]
for ax, (param, true_val) in zip(axes, params):
az.plot_posterior(
idata_sar,
var_names=[param],
ax=ax,
ref_val=true_val,
hdi_prob=0.94,
)
# ax.set_title(f"${param.replace('_', r'\\_')}$ (true = {true_val})")
plt.suptitle("SARFlow — posterior distributions (red line = true value)", y=1.02)
plt.tight_layout()
plt.show()
3 SARFlowSeparable: Constrained model¶
SARFlowSeparable imposes the constraint \(\rho_w = -\rho_d \rho_o\), yielding:
- Exact log-det via eigenvalues of the small \(n \times n\) matrix \(W\) — no Monte Carlo traces needed.
- Two free parameters (\(\rho_d\), \(\rho_o\)) instead of three.
- Typically faster mixing because the posterior geometry is simpler.
We generate new data that satisfies the constraint (\(\rho_w = -\rho_d\rho_o = -0.12\)) to give the model the best chance at recovery.
# True parameters for the separable model (asymmetric for identifiability)
RHO_D_SEP = 0.40
RHO_O_SEP = 0.30
RHO_W_SEP = -RHO_D_SEP * RHO_O_SEP # = -0.12 (constraint)
print(
f"Separable true values: rho_d={RHO_D_SEP} rho_o={RHO_O_SEP}"
f" rho_w=-rho_d*rho_o={RHO_W_SEP:.4f}"
)
data_sep = generate_flow_data_separable(
n,
G,
rho_d=RHO_D_SEP,
rho_o=RHO_O_SEP,
beta_d=BETA_D,
beta_o=BETA_O,
sigma=SIGMA,
seed=7,
)
y_sep = np.log(data_sep["y_vec"]) # latent SAR scale (lognormal default)
X_sep = data_sep["X"]
cn_sep = data_sep["col_names"]
print(
f"\ny_sep summary: min={y_sep.min():.2f} mean={y_sep.mean():.2f} max={y_sep.max():.2f}"
)
Separable true values: rho_d=0.4 rho_o=0.3 rho_w=-rho_d*rho_o=-0.1200
y_sep summary: min=-8.22 mean=-3.04 max=1.45
sep_flow = SARFlowSeparable(
y_sep,
G,
X_sep,
col_names=cn_sep,
# logdet_method="eigenvalue" is the default for SARFlowSeparable
trace_seed=0,
)
idata_sep = sep_flow.fit(
draws=1000,
tune=1000,
chains=4,
target_accept=0.9,
random_seed=42,
progressbar=True,
)
Initializing NUTS using jitter+adapt_diag...
Multiprocess sampling (4 chains in 2 jobs)
NUTS: [rho_d, rho_o, beta, sigma]
Sampling 4 chains for 1_000 tune and 1_000 draw iterations (4_000 + 4_000 draws total) took 19 seconds.
summary_sep = sep_flow.summary(var_names=["rho_d", "rho_o"])
print("=== SARFlowSeparable: spatial autoregressive parameters ===")
print(summary_sep.to_string())
print(
f"\nTrue values: rho_d={RHO_D_SEP} rho_o={RHO_O_SEP}"
f" (rho_w = -rho_d*rho_o = {RHO_W_SEP:.4f} is derived)"
)
=== SARFlowSeparable: spatial autoregressive parameters ===
mean sd hdi_3% hdi_97% mcse_mean mcse_sd ess_bulk ess_tail r_hat
rho_d 0.174 0.106 -0.035 0.368 0.002 0.002 2965.0 2628.0 1.0
rho_o 0.314 0.098 0.143 0.509 0.002 0.002 2430.0 2454.0 1.0
True values: rho_d=0.4 rho_o=0.3 (rho_w = -rho_d*rho_o = -0.1200 is derived)
az.plot_trace(
idata_sep,
var_names=["rho_d", "rho_o"],
figsize=(10, 4),
)
plt.suptitle("SARFlowSeparable — posterior traces", y=1.01)
plt.tight_layout()
plt.show()
fig, axes = plt.subplots(1, 2, figsize=(8, 3))
for ax, (param, true_val) in zip(axes, [("rho_d", RHO_D_SEP), ("rho_o", RHO_O_SEP)]):
az.plot_posterior(
idata_sep, var_names=[param], ax=ax, ref_val=true_val, hdi_prob=0.94
)
    ax.set_title(f"${param.replace('_', r'\_')}$ (true = {true_val})")
plt.suptitle("SARFlowSeparable — posterior distributions", y=1.02)
plt.tight_layout()
plt.show()
4 Design Matrix Details¶
The flow_design_matrix helper builds the standard LeSage O-D design matrix with destination, origin, and intra-zonal coefficient blocks. Each column of the regional attribute matrix \(X\) (shape \(n \times k\)) produces three columns in the full design matrix: one for destination effects, one for origin effects, and one for intra-zonal effects.
| Column group | Construction | Interpretation |
|---|---|---|
| intercept | \(\mathbf{1}_N\) | global mean flow |
| intra_indicator | \(\operatorname{vec}(I_n)\) | intra-zonal dummy |
| dest_* | \(\iota_n \otimes X\) | destination characteristics |
| orig_* | \(X \otimes \iota_n\) | origin characteristics |
| intra_* | \(\operatorname{vec}(I_n) \odot (\iota_n \otimes X)\) | intra-zonal attributes |
| log_distance | \(\log(1 + d_{ij})\) | gravity distance decay |
By default the DGP appends a log_distance column built from
\(\log(1 + d_{ij})\) and assigns it the coefficient gamma_dist=-0.5
(set gamma_dist=0.0 to neutralise the effect). When you build a
design matrix by hand via flow_design_matrix, pass dist=...
together with log_distance=True to include the same column.
You can also pass a pre-built pd.DataFrame directly as X — column names are inferred automatically.
# Build a design matrix from scratch using regional attributes
X_regional = rng.standard_normal((n, 2)) # n × k attribute matrix
dm = flow_design_matrix(X_regional, col_names=["income", "pop"])
print(f"Regional attribute matrix: {X_regional.shape} (n × k)")
print(f"Full O-D design matrix: {dm.combined.shape} (N × p)")
print(f"\nColumn names:\n {dm.feature_names}")
pd.DataFrame(dm.combined[:6], columns=dm.feature_names).round(3)
Regional attribute matrix: (12, 2) (n × k)
Full O-D design matrix: (144, 8) (N × p)
Column names:
['intercept', 'intra_indicator', 'dest_income', 'dest_pop', 'orig_income', 'orig_pop', 'intra_income', 'intra_pop']
| | intercept | intra_indicator | dest_income | dest_pop | orig_income | orig_pop | intra_income | intra_pop |
|---|---|---|---|---|---|---|---|---|
| 0 | 1.0 | 1.0 | 0.305 | -1.040 | 0.305 | -1.04 | 0.305 | -1.04 |
| 1 | 1.0 | 0.0 | 0.750 | 0.941 | 0.305 | -1.04 | 0.000 | 0.00 |
| 2 | 1.0 | 0.0 | -1.951 | -1.302 | 0.305 | -1.04 | -0.000 | -0.00 |
| 3 | 1.0 | 0.0 | 0.128 | -0.316 | 0.305 | -1.04 | 0.000 | -0.00 |
| 4 | 1.0 | 0.0 | -0.017 | -0.853 | 0.305 | -1.04 | -0.000 | -0.00 |
| 5 | 1.0 | 0.0 | 0.879 | 0.778 | 0.305 | -1.04 | 0.000 | 0.00 |
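The dest_*/orig_* constructions in the table above are plain Kronecker products, which is easy to check by hand. A sketch (the O-D pair ordering assumed here — destination index varying fastest — matches the vec layout used above):

```python
import numpy as np

rng_sk = np.random.default_rng(0)
n_r, k_attr = 4, 2
X_attr = rng_sk.standard_normal((n_r, k_attr))   # regional attributes
ones = np.ones((n_r, 1))

dest_block = np.kron(ones, X_attr)   # iota_n (x) X: X stacked once per origin
orig_block = np.kron(X_attr, ones)   # X (x) iota_n: each row repeated n times

# O-D pair index i = o*n + d (destination varies fastest)
o, d = 2, 3
i_pair = o * n_r + d
assert np.allclose(dest_block[i_pair], X_attr[d])   # destination attributes
assert np.allclose(orig_block[i_pair], X_attr[o])   # origin attributes
```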
5 Spatial Effects¶
The flow SAR model has a rich effects decomposition due to
Thomas-Agnan and LeSage (2014, chapter 83, §83.5). For each
regional predictor \(p\), a unit shock to \(X_d^{(p)}\) (destination side) and a
unit shock to \(X_o^{(p)}\) (origin side) propagate through the spatial filter
\(A = I_N - \rho_d W_d - \rho_o W_o - \rho_w W_w\) to produce five scalar
summaries (averaged across the \(n\) perturbed regions):
| Effect | Symbol | Cells aggregated |
|---|---|---|
| Origin | OE | flows whose origin matches the perturbed region |
| Destination | DE | flows whose destination matches the perturbed region |
| Intra | IE | the diagonal \((j, j)\) flow of the perturbed region |
| Network | NE | all remaining flows |
| Total | TE | OE + DE + IE + NE |
The high-level wrapper model.spatial_effects() returns a tidy DataFrame.
Three modes are supported:
- mode="auto" (default) — combined (sum of dest+orig) when \(X_o = X_d\), otherwise separate.
- mode="combined" — always sum dest and orig contributions.
- mode="separate" — always report both sides (Thomas-Agnan §83.5.2).
When intra_* columns are present the destination shock also carries
\(\beta_\text{intra}\) at the diagonal cell, since the design matrix sets
X_intra = intra_indicator · X_dest.
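The decomposition can be sketched for a single draw and a single perturbed region \(j\): shock region \(j\)'s destination attribute, push \(\Delta X\beta\) through \(A^{-1}\), and aggregate the response cells by which index matches \(j\). The ρ values and the coefficient `beta_dest` below are illustrative; the library additionally averages over the \(n\) perturbed regions and the posterior draws:

```python
import numpy as np

rng_sk = np.random.default_rng(0)
m = 4                                  # small illustrative number of regions
Nm = m * m
rho_d, rho_o, rho_w, beta_dest = 0.2, 0.15, 0.05, 1.0

Wm = rng_sk.random((m, m))
np.fill_diagonal(Wm, 0.0)
Wm /= Wm.sum(axis=1, keepdims=True)
Im = np.eye(m)
A = (np.eye(Nm) - rho_d * np.kron(Im, Wm)
     - rho_o * np.kron(Wm, Im) - rho_w * np.kron(Wm, Wm))

# Unit shock to region j's destination attribute hits every cell with d == j
j = 1
orig_idx, dest_idx = np.divmod(np.arange(Nm), m)   # O-D pair i = o*m + d
dy = np.linalg.solve(A, (dest_idx == j) * beta_dest)

intra = (orig_idx == j) & (dest_idx == j)
OE = dy[(orig_idx == j) & ~intra].sum()                  # origin matches j
DE = dy[(dest_idx == j) & ~intra].sum()                  # destination matches j
IE = dy[intra].sum()                                     # diagonal (j, j) cell
NE = dy[~((orig_idx == j) | (dest_idx == j))].sum()      # all remaining cells

assert np.isclose(OE + DE + IE + NE, dy.sum())           # TE identity holds
```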
# Combined effects (default for symmetric Xo == Xd designs).
effects_df = sar_flow.spatial_effects(mode="combined")
effects_df.round(4)
| predictor | side | effect | mean | ci_lower | ci_upper | bayes_pvalue |
|---|---|---|---|---|---|---|
| x0 | combined | origin | 1.0503 | 0.6952 | 1.5691 | 0.0000 |
| x1 | combined | origin | 0.3848 | 0.0985 | 0.6795 | 0.0120 |
| x0 | combined | destination | 1.3697 | 1.0426 | 1.8541 | 0.0000 |
| x1 | combined | destination | -0.3645 | -0.6499 | -0.0598 | 0.0245 |
| x0 | combined | intra | 0.1684 | 0.0958 | 0.2441 | 0.0000 |
| x1 | combined | intra | -0.0137 | -0.0794 | 0.0549 | 0.6860 |
| x0 | combined | network | 2.2972 | 0.4723 | 6.7456 | 0.0000 |
| x1 | combined | network | 0.0436 | -0.4727 | 0.7376 | 0.9995 |
| x0 | combined | total | 4.8856 | 2.5974 | 10.2205 | 0.0000 |
| x1 | combined | total | 0.0503 | -0.8938 | 1.2163 | 0.9835 |
# Small sampler settings used for the demo fits below.
SAMPLE_KWARGS_DEMO = dict(draws=200, tune=200, chains=2,
random_seed=0, progressbar=False)
5.1 Asymmetric origin / destination attributes¶
When destination and origin attribute matrices have different numbers of columns (k_d ≠ k_o), pass beta_d and beta_o with different lengths to generate_flow_data. The design matrix layout becomes:
intercept | intra_indicator | dest_*(k_d) | orig_*(k_o) | intra_*(k_d) | dist
The model auto-detects k_d and k_o from column name prefixes (dest_* vs orig_*). spatial_effects() reports destination and origin effects separately when k_d ≠ k_o, since the predictors are different variables.
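The detection amounts to counting column-name prefixes, which you can reproduce yourself (a sketch of the idea rather than the library's exact parsing):

```python
cols = ['intercept', 'intra_indicator', 'dest_x0', 'dest_x1',
        'orig_y0', 'intra_x0', 'intra_x1', 'log_distance']

k_d = sum(c.startswith('dest_') for c in cols)   # destination attributes
k_o = sum(c.startswith('orig_') for c in cols)   # origin attributes

assert (k_d, k_o) == (2, 1)   # asymmetric design detected
```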
from bayespecon.dgp.flows import generate_flow_data
# Asymmetric: 2 destination attributes, 1 origin attribute
data_asym = generate_flow_data(
n=5,
rho_d=0.2, rho_o=0.15, rho_w=0.05,
beta_d=[1.0, -0.5], # k_d = 2
beta_o=[0.5], # k_o = 1
sigma=0.5, seed=11,
)
G_asym = data_asym["G"]
print(f"k_d={data_asym['design'].k_d}, k_o={data_asym['design'].k_o}")
print(f"Columns: {data_asym['col_names']}")
model_asym = SARFlow(
np.log(data_asym["y_vec"]), G_asym, data_asym["X"],
col_names=data_asym["col_names"],
)
print("symmetric_xo_xd =", model_asym._symmetric_xo_xd)
model_asym.fit(**SAMPLE_KWARGS_DEMO)
model_asym.spatial_effects().round(4) # auto -> separate
k_d=2, k_o=1
Columns: ['intercept', 'intra_indicator', 'dest_x0', 'dest_x1', 'orig_y0', 'intra_x0', 'intra_x1', 'log_distance']
symmetric_xo_xd = False
Initializing NUTS using jitter+adapt_diag...
Multiprocess sampling (2 chains in 2 jobs)
NUTS: [rho_simplex, beta, sigma]
Sampling 2 chains for 200 tune and 200 draw iterations (400 + 400 draws total) took 18 seconds.
We recommend running at least 4 chains for robust computation of convergence diagnostics
The rhat statistic is larger than 1.01 for some parameters. This indicates problems during sampling. See https://arxiv.org/abs/1903.08008 for details
The effective sample size per chain is smaller than 100 for some parameters. A higher number is needed for reliable rhat and ess computation. See https://arxiv.org/abs/1903.08008 for details
| predictor | side | effect | mean | ci_lower | ci_upper | bayes_pvalue |
|---|---|---|---|---|---|---|
| x0 | dest | origin | 0.4821 | 0.0205 | 2.1100 | 0.000 |
| x1 | dest | origin | -0.2589 | -1.1976 | -0.0098 | 0.000 |
| x0 | dest | destination | 1.2029 | 0.5060 | 2.7569 | 0.000 |
| x1 | dest | destination | -0.7665 | -1.7403 | -0.4156 | 0.000 |
| x0 | dest | intra | 0.3384 | 0.1113 | 0.7707 | 0.005 |
| x1 | dest | intra | -0.1362 | -0.3784 | -0.0168 | 0.020 |
| x0 | dest | network | 1.8624 | 0.0868 | 8.3674 | 0.000 |
| x1 | dest | network | -1.1366 | -4.9222 | -0.0589 | 0.000 |
| x0 | dest | total | 3.8859 | 0.9522 | 14.1595 | 0.000 |
| x1 | dest | total | -2.2982 | -8.2249 | -0.6393 | 0.000 |
| y0 | orig | origin | 0.7337 | 0.2069 | 1.4377 | 0.010 |
| y0 | orig | destination | 0.1827 | 0.0069 | 0.7493 | 0.010 |
| y0 | orig | intra | 0.1834 | 0.0517 | 0.3594 | 0.010 |
| y0 | orig | network | 0.7309 | 0.0277 | 2.9970 | 0.010 |
| y0 | orig | total | 1.8308 | 0.3347 | 5.3013 | 0.010 |
# You can also build asymmetric designs directly with flow_design_matrix_asymmetric
from bayespecon.graph import flow_design_matrix_asymmetric
rng_asym = np.random.default_rng(7)
n_demo = 5
Xd = rng_asym.standard_normal((n_demo, 2)) # 2 destination variables
Xo = rng_asym.standard_normal((n_demo, 1)) # 1 origin variable
dm = flow_design_matrix_asymmetric(Xd, Xo)
print(f"Design shape: {dm.combined.shape}")
print(f"k_d={dm.k_d}, k_o={dm.k_o}")
print(f"Feature names: {dm.feature_names}")
Design shape: (25, 7)
k_d=2, k_o=1
Feature names: ['intercept', 'intra_indicator', 'dest_x0', 'dest_x1', 'orig_y0', 'intra_x0', 'intra_x1']
5.2 Non-spatial OLS gravity baseline (OLSFlow)¶
OLSFlow is the conventional log-linear gravity model
(Thomas-Agnan & LeSage 2014, eq. 83.2): no spatial-lag terms, just
\(y = X \beta + \varepsilon\) with iid Gaussian errors. It uses the
same spatial_effects() API and reproduces the closed-form
expressions in Table 83.1 (with \(A = I_N\)):
from bayespecon import OLSFlow
ols_flow = OLSFlow(y_vec, G, X, col_names=col_names)
ols_flow.fit(**SAMPLE_KWARGS_DEMO)
ols_flow.spatial_effects(mode="combined").round(4)
Initializing NUTS using jitter+adapt_diag...
Multiprocess sampling (2 chains in 2 jobs)
NUTS: [beta, sigma]
Sampling 2 chains for 200 tune and 200 draw iterations (400 + 400 draws total) took 5 seconds.
We recommend running at least 4 chains for robust computation of convergence diagnostics
The rhat statistic is larger than 1.01 for some parameters. This indicates problems during sampling. See https://arxiv.org/abs/1903.08008 for details
The effective sample size per chain is smaller than 100 for some parameters. A higher number is needed for reliable rhat and ess computation. See https://arxiv.org/abs/1903.08008 for details
| predictor | side | effect | mean | ci_lower | ci_upper | bayes_pvalue |
|---|---|---|---|---|---|---|
| x0 | combined | origin | 0.8135 | 0.5986 | 1.0445 | 0.000 |
| x1 | combined | origin | 0.2887 | 0.0519 | 0.5308 | 0.010 |
| x0 | combined | destination | 1.2213 | 1.0018 | 1.4380 | 0.000 |
| x1 | combined | destination | -0.4622 | -0.7067 | -0.2168 | 0.000 |
| x0 | combined | intra | 0.1431 | 0.0657 | 0.2103 | 0.000 |
| x1 | combined | intra | -0.0229 | -0.0840 | 0.0452 | 0.495 |
| x0 | combined | network | 0.0000 | 0.0000 | 0.0000 | 0.000 |
| x1 | combined | network | 0.0000 | 0.0000 | 0.0000 | 0.000 |
| x0 | combined | total | 2.1779 | 1.8483 | 2.4992 | 0.000 |
| x1 | combined | total | -0.1964 | -0.5958 | 0.1888 | 0.310 |
6 MCMC Diagnostics¶
Good practice after any Bayesian fit: check \(\hat{R}\) (should be \(< 1.01\)) and the bulk effective sample size (ESS \(> 400\) overall, i.e. roughly 100 per chain with 4 chains).
for label, idata in [("SARFlow", idata_sar), ("SARFlowSeparable", idata_sep)]:
diag = az.summary(idata, var_names=["rho_d", "rho_o"], stat_focus="mean")
rhat_ok = (diag["r_hat"] < 1.01).all()
ess_ok = (diag["ess_bulk"] > 400).all()
print(f"{label:25s} r_hat<1.01: {rhat_ok} ess_bulk>400: {ess_ok}")
print(diag[["mean", "sd", "hdi_3%", "hdi_97%", "ess_bulk", "r_hat"]].to_string())
print()
SARFlow r_hat<1.01: True ess_bulk>400: True
mean sd hdi_3% hdi_97% ess_bulk r_hat
rho_d 0.129 0.086 0.0 0.277 2403.0 1.0
rho_o 0.121 0.082 0.0 0.265 2364.0 1.0
SARFlowSeparable r_hat<1.01: True ess_bulk>400: True
mean sd hdi_3% hdi_97% ess_bulk r_hat
rho_d 0.174 0.106 -0.035 0.368 2965.0 1.0
rho_o 0.314 0.098 0.143 0.509 2430.0 1.0
Spatial-parameter adequacy¶
Flow models have three spatial scalars (\(\rho_d\), \(\rho_o\), \(\rho_w\)),
and each is prone to the slow-mixing behaviour discussed in Wolf,
Anselin and Arribas-Bel (2018).
The spatial_mcmc_diagnostic helper auto-detects all three and reports
ESS, sampler yield, \(\hat{R}\), and HPDI stability.
from bayespecon import spatial_mcmc_diagnostic
spatial_mcmc_diagnostic(sar_flow, emit_warnings=False).to_frame()
| parameter | ess_bulk | ess_tail | r_hat | mcse_mean | yield_pct | hpdi_drift_pct | adequate |
|---|---|---|---|---|---|---|---|
| rho_d | 2403.301734 | 1642.150753 | 1.001299 | 0.001630 | 60.082543 | 4.102400 | True |
| rho_o | 2364.052318 | 1668.068665 | 1.001046 | 0.001575 | 59.101308 | 4.132537 | True |
| rho_w | 3308.179383 | 2053.519490 | 1.000170 | 0.002483 | 82.704485 | 3.997955 | True |
# Energy plot — checks that HMC is exploring the posterior efficiently
fig, axes = plt.subplots(1, 2, figsize=(10, 3))
az.plot_energy(idata_sar, ax=axes[0])
axes[0].set_title("SARFlow energy")
az.plot_energy(idata_sep, ax=axes[1])
axes[1].set_title("SARFlowSeparable energy")
plt.tight_layout()
plt.show()
7 Model Comparison¶
WAIC / LOO-CV scores can be used to compare SARFlow and SARFlowSeparable when both are estimated on the same data. A higher ELPD (expected log pointwise predictive density) indicates better expected predictive performance; equivalently, a lower ELPD indicates a worse-fitting model.
Note: here we fit SARFlow to the separable data to make the comparison meaningful — the separable model is nested in the unrestricted one.
# Fit unrestricted SARFlow on the same separable data for a fair comparison
sar_flow_on_sep = SARFlow(
y_sep,
G,
X_sep,
col_names=cn_sep,
logdet_method="traces",
miter=20,
trace_seed=0,
)
idata_sar_on_sep = sar_flow_on_sep.fit(
draws=1000,
tune=1000,
chains=4,
target_accept=0.9,
random_seed=42,
progressbar=True,
)
Initializing NUTS using jitter+adapt_diag...
Multiprocess sampling (4 chains in 2 jobs)
NUTS: [rho_simplex, beta, sigma]
Sampling 4 chains for 1_000 tune and 1_000 draw iterations (4_000 + 4_000 draws total) took 92 seconds.
# LOO-CV comparison (requires log-likelihood in idata; falls back to WAIC if unavailable)
try:
loo_sar = az.loo(idata_sar_on_sep)
loo_sep = az.loo(idata_sep)
comparison = az.compare(
{"SARFlow (3 ρ)": idata_sar_on_sep, "SARFlowSeparable": idata_sep}
)
print("LOO-CV model comparison:")
print(
comparison[
["elpd_loo", "p_loo", "elpd_diff", "weight", "se", "warning"]
].to_string()
)
except Exception as e:
print(f"LOO not available (log-likelihood not stored): {e}")
print("Tip: pass compute_log_likelihood=True to fit() to enable LOO/WAIC.")
LOO-CV model comparison:
elpd_loo p_loo elpd_diff weight se warning
SARFlowSeparable -195.044604 11.321569 0.000000 1.0 8.482200 False
SARFlow (3 ρ) -195.511241 11.185733 0.466637 0.0 8.521106 False
8 Prior Sensitivity and Stability Constraint¶
By default SARFlow uses a Dirichlet prior that forces \(\rho_d, \rho_o, \rho_w \geq 0\) and \(\rho_d + \rho_o + \rho_w \leq 1\). Setting restrict_positive=False allows negative spillovers via independent Uniform(-1, 1) priors plus a differentiable stability wall potential.
Use restrict_positive=False when competitive effects (e.g. negative network parameter) are theoretically expected.
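As a sketch of the idea only — bayespecon's exact potential is not reproduced here — a differentiable wall can be built from the log of the stability margin, which diverges to −∞ as \(|\rho_d| + |\rho_o| + |\rho_w| \to 1\) and so repels the sampler from the boundary while leaving the interior almost unpenalised:

```python
import numpy as np

def stability_wall(rho_d, rho_o, rho_w, sharpness=1.0):
    """Illustrative log-potential: -inf at the stability boundary."""
    margin = 1.0 - (abs(rho_d) + abs(rho_o) + abs(rho_w))
    if margin <= 0:
        return -np.inf
    return sharpness * np.log(margin)

inside = stability_wall(0.2, 0.1, 0.05)     # mild penalty deep inside
near_edge = stability_wall(0.5, 0.3, 0.19)  # large penalty near the wall

assert inside > near_edge
assert stability_wall(0.6, 0.3, 0.2) == -np.inf   # outside: zero density
```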
# Fit with restrict_positive=False to allow negative rho values
sar_flow_neg = SARFlow(
y_vec,
G,
X,
col_names=col_names,
logdet_method="traces",
restrict_positive=False, # Uniform(-1,1) priors + stability potential
miter=20,
trace_seed=0,
)
idata_neg = sar_flow_neg.fit(
draws=800,
tune=1000,
chains=4,
target_accept=0.95,
random_seed=42,
progressbar=True,
)
summary_neg = sar_flow_neg.summary()
print("=== SARFlow (restrict_positive=False) ===")
print(summary_neg[["mean", "sd", "hdi_3%", "hdi_97%", "r_hat"]].to_string())
Initializing NUTS using jitter+adapt_diag...
Multiprocess sampling (4 chains in 2 jobs)
NUTS: [rho_d, rho_o, rho_w, beta, sigma]
Sampling 4 chains for 1_000 tune and 800 draw iterations (4_000 + 3_200 draws total) took 48 seconds.
There were 2 divergences after tuning. Increase `target_accept` or reparameterize.
=== SARFlow (restrict_positive=False) ===
mean sd hdi_3% hdi_97% r_hat
rho_d 0.087 0.123 -0.127 0.333 1.0
rho_o 0.056 0.126 -0.184 0.283 1.0
rho_w 0.365 0.182 0.025 0.716 1.0
beta[intercept] -0.167 0.345 -0.812 0.478 1.0
beta[intra_indicator] 0.057 0.471 -0.836 0.911 1.0
beta[dest_x0] 1.201 0.187 0.866 1.553 1.0
beta[dest_x1] -0.377 0.134 -0.635 -0.137 1.0
beta[orig_x0] 0.843 0.166 0.522 1.146 1.0
beta[orig_x1] 0.382 0.140 0.114 0.645 1.0
beta[intra_x0] -0.404 0.447 -1.218 0.466 1.0
beta[intra_x1] -0.195 0.439 -1.029 0.610 1.0
beta[log_distance] -0.584 0.363 -1.196 0.140 1.0
sigma 0.933 0.057 0.825 1.037 1.0
10 Bayesian LM Diagnostics¶
Once a baseline gravity model has been fit, the Bayesian Lagrange-multiplier
(LM) tests in bayespecon.diagnostics indicate which spatial-lag directions
(destination, origin, network) are worth adding — and which intra-block
columns deserve their own coefficients. They are the OD-flow analogues of the
classic Anselin/Koley–Bera LM family, ported to the posterior-predictive
score formulation of Doğan, Taşpınar & Bera (2021).
For each posterior draw \(g\) from a fitted null model with residuals \(e_g = y - X\beta_g\), the score for direction \(i \in \{d, o, w\}\) is

\[
s_g^{(i)} = \frac{e_g^\top W_i\, e_g}{\sigma_g^2},
\]

and the information matrix uses the cached Kronecker-trace block
\(T_{\text{flow}}\) (computed in \(\mathcal{O}(\mathrm{nnz})\) from
flow_trace_blocks(W)):

\[
J_{ij} = \big[T_{\text{flow}}\big]_{ij} = \operatorname{tr}\!\big(W_i W_j + W_i^\top W_j\big).
\]

The marginal statistic is \(LM_g = (s_g^{(i)})^2 / J_{ii}\) (\(\chi^2_1\)); the joint statistic is \(s_g^\top J^{-1} s_g\) with \(s_g = (s_g^{(d)}, s_g^{(o)}, s_g^{(w)})\) (\(\chi^2_3\)). The Bayesian \(p\)-value reports \(1 - F_{\chi^2_{\text{df}}}(\overline{LM})\).
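The Bayes p-value step is then a one-liner (this sketch uses scipy.stats, which is an assumption about your environment — it is not imported elsewhere in this notebook; the numbers come from the ρ_w marginal test reported below):

```python
from scipy.stats import chi2

# Bayes p = upper-tail chi^2 probability of the averaged LM statistic
lm_mean, df = 2.9788, 1            # rho_w (network) marginal test below
p_w = float(chi2.sf(lm_mean, df))  # approximately 0.0844, as in the table
```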
from bayespecon import (
bayesian_lm_flow_dest_test,
bayesian_lm_flow_orig_test,
bayesian_lm_flow_network_test,
bayesian_lm_flow_joint_test,
bayesian_lm_flow_intra_test,
)
# All marginal + joint + intra tests, evaluated at the OLSFlow null draws.
flow_lm_tests = {
"rho_d (dest)": bayesian_lm_flow_dest_test(ols_flow),
"rho_o (orig)": bayesian_lm_flow_orig_test(ols_flow),
"rho_w (network)": bayesian_lm_flow_network_test(ols_flow),
"joint (d,o,w)": bayesian_lm_flow_joint_test(ols_flow),
"intra block": bayesian_lm_flow_intra_test(ols_flow),
}
import pandas as pd
pd.DataFrame(
[(k, r.df, r.mean, r.bayes_pvalue) for k, r in flow_lm_tests.items()],
columns=["test", "df", "LM mean", "Bayes p"],
).round(4)
| | test | df | LM mean | Bayes p |
|---|---|---|---|---|
| 0 | rho_d (dest) | 1 | 1.3262 | 0.2495 |
| 1 | rho_o (orig) | 1 | 1.1684 | 0.2797 |
| 2 | rho_w (network) | 1 | 2.9788 | 0.0844 |
| 3 | joint (d,o,w) | 3 | 3.9794 | 0.2637 |
| 4 | intra block | 3 | 2.6648 | 0.4462 |
10.1 Robust (Neyman-orthogonal) tests¶
The marginal LM tests assume the other two ρ parameters are zero under the null. When that assumption is in doubt, evaluate the score and information at the alternative SARFlow posterior and apply the Neyman-orthogonal adjustment

\[
s_g^{*(i)} = s_g^{(i)} - J_{i\nu} J_{\nu\nu}^{-1} s_g^{(\nu)},
\qquad
LM_g^{*} = \frac{\big(s_g^{*(i)}\big)^2}{J_{ii} - J_{i\nu} J_{\nu\nu}^{-1} J_{\nu i}},
\]

where \(\nu\) indexes the two nuisance directions. The result is robust to local misspecification of the nuisance ρ's (Doğan et al. 2021, Proposition 3).
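Numerically, the adjustment is standard partialling-out of the nuisance scores. A sketch with made-up values — `J` and `s` below are illustrative, not taken from the fitted model:

```python
import numpy as np

# Illustrative 3x3 information matrix over (d, o, w) and a score vector
J = np.array([[4.0, 1.0, 0.5],
              [1.0, 3.0, 0.4],
              [0.5, 0.4, 2.0]])
s = np.array([1.2, 0.8, 0.5])

i, nu = 0, [1, 2]                  # test rho_d; nuisances (rho_o, rho_w)
J_nn = J[np.ix_(nu, nu)]
J_in = J[i, nu]

s_star = s[i] - J_in @ np.linalg.solve(J_nn, s[nu])     # orthogonalised score
J_star = J[i, i] - J_in @ np.linalg.solve(J_nn, J_in)   # its information
lm_star = s_star**2 / J_star                            # chi^2_1 statistic

assert J_star < J[i, i]   # partialling out strictly reduces the information
assert lm_star >= 0.0
```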
from bayespecon import (
bayesian_robust_lm_flow_dest_test,
bayesian_robust_lm_flow_orig_test,
bayesian_robust_lm_flow_network_test,
)
robust_tests = {
"rho_d | (rho_o, rho_w)": bayesian_robust_lm_flow_dest_test(sar_flow),
"rho_o | (rho_d, rho_w)": bayesian_robust_lm_flow_orig_test(sar_flow),
"rho_w | (rho_d, rho_o)": bayesian_robust_lm_flow_network_test(sar_flow),
}
pd.DataFrame(
[(k, r.mean, r.bayes_pvalue) for k, r in robust_tests.items()],
columns=["test", "LM* mean", "Bayes p"],
).round(4)
| | test | LM* mean | Bayes p |
|---|---|---|---|
| 0 | rho_d \| (rho_o, rho_w) | 0.5422 | 0.4615 |
| 1 | rho_o \| (rho_d, rho_w) | 0.4449 | 0.5047 |
| 2 | rho_w \| (rho_d, rho_o) | 0.5610 | 0.4538 |
10.2 Panel analogues¶
For a fitted OLSFlowPanel,
the same tests are available with the bayesian_panel_lm_flow_* prefix.
Scores accumulate over the demeaned panel stack of length \(n^2 T\),
and the information matrix scales the trace block by \(T\) to reflect i.i.d.
within-period contributions under \(H_0\): \(J^{\text{panel}} = T \, T_{\text{flow}}\).
from bayespecon import (
bayesian_panel_lm_flow_dest_test,
bayesian_panel_lm_flow_orig_test,
bayesian_panel_lm_flow_network_test,
bayesian_panel_lm_flow_joint_test,
bayesian_panel_lm_flow_intra_test,
)
11 Quick Reference¶
from bayespecon import OLSFlow, SARFlow, SARFlowSeparable
from bayespecon import flow_design_matrix
from bayespecon.dgp.flows import generate_flow_data, generate_flow_data_separable
# Build design matrix from regional attributes
dm = flow_design_matrix(X_regional, col_names=["income", "pop"])
# --- SARFlow (three free ρ parameters) ---
model = SARFlow(
y, G, dm.combined,
col_names=dm.feature_names,
logdet_method="traces", # Barry-Pace stochastic traces (default)
restrict_positive=True, # Dirichlet stability prior
)
idata = model.fit(draws=2000, tune=1000, chains=4, random_seed=0)
model.summary(var_names=["rho_d", "rho_o", "rho_w"])
# --- SARFlowSeparable (ρ_w = -ρ_d·ρ_o, exact log-det) ---
sep_model = SARFlowSeparable(
y, G, dm.combined,
col_names=dm.feature_names,
# logdet_method="eigenvalue" is the default
)
idata_sep = sep_model.fit(draws=2000, tune=1000, chains=4, random_seed=0)
# --- OLSFlow (non-spatial gravity baseline) ---
ols = OLSFlow(y, G, dm.combined, col_names=dm.feature_names)
idata_ols = ols.fit(draws=2000, tune=1000, chains=4, random_seed=0)
ols.spatial_effects(mode="combined") # closed-form Table 83.1
Key parameters¶
| Parameter | Default | Description |
|---|---|---|
| `logdet_method` | `"traces"` | Log-determinant method: `"traces"` (Barry–Pace) for `SARFlow`; `SARFlowSeparable` defaults to the exact `"eigenvalue"` method |
| `restrict_positive` | `True` | Dirichlet prior; set `restrict_positive=False` for Uniform(−1, 1) priors with a stability potential |
| `miter` | 30 | Trace polynomial order (higher = more accurate, slower precomputation) |
| — | 50 | Monte Carlo probes for trace estimation |
Model choice guide¶
| Data characteristic | Recommended model |
|---|---|
| No prior on sign of spatial effects | `SARFlow(restrict_positive=False)` |
| Positive spillovers expected | `SARFlow` (default Dirichlet prior) |
| Separability plausible, fast inference needed | `SARFlowSeparable` |
| Count/non-negative flows | |