Bayesian Spatial Flow (Origin–Destination) Models

This notebook demonstrates the cross-sectional flow models in bayespecon:

| Model | Description |
| --- | --- |
| SARFlow | Three free ρ parameters — destination, origin, network |
| SARFlowSeparable | Constrained ρ_w = −ρ_d·ρ_o; exact eigenvalue log-det |

Both are fully Bayesian implementations of the LeSage–Fischer (2008) SAR flow framework.


Background: The SAR Flow Model

Let \(y\) be the \(N\)-vector of observed flows (\(N = n^2\) O-D pairs). The model is

\[ y = \rho_d W_d y + \rho_o W_o y + \rho_w W_w y + X\beta + \varepsilon, \quad \varepsilon \sim \mathcal{N}(0, \sigma^2 I_N) \]

where

\[ W_d = I_n \otimes W, \quad W_o = W \otimes I_n, \quad W_w = W \otimes W \]

capture destination, origin, and network (O-D pair) neighbourhood dependence respectively.
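These Kronecker constructions are easy to verify numerically. A minimal sketch with a random row-standardised `W` (plain numpy, not package code) also confirms the mixed-product identity \(W_w = W_d W_o\):

```python
import numpy as np

rng = np.random.default_rng(0)
n = 4
W = rng.random((n, n))
np.fill_diagonal(W, 0.0)
W = W / W.sum(axis=1, keepdims=True)   # row-standardise
I = np.eye(n)

W_d = np.kron(I, W)   # destination dependence: I_n ⊗ W
W_o = np.kron(W, I)   # origin dependence:      W ⊗ I_n
W_w = np.kron(W, W)   # network dependence:     W ⊗ W

# Mixed-product property: (I ⊗ W)(W ⊗ I) = (I·W) ⊗ (W·I) = W ⊗ W
print(W_d.shape)                     # (16, 16) = (n², n²)
print(np.allclose(W_w, W_d @ W_o))   # True
```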

The reduced form is

\[ y = (I_N - \rho_d W_d - \rho_o W_o - \rho_w W_w)^{-1}(X\beta + \varepsilon) \]

The design matrix \(X\) follows the LeSage layout with separate destination, origin, and intra-zonal coefficient blocks:

\[ \beta = [\alpha,\; \alpha_i,\; \beta_d^1\ldots\beta_d^k,\; \beta_o^1\ldots\beta_o^k,\; \beta_i^1\ldots\beta_i^k] \]

Separable variant

SARFlowSeparable imposes \(\rho_w = -\rho_d \rho_o\), which makes the filter matrix factorisable as a Kronecker product:

\[ A = (I_n - \rho_o W) \otimes (I_n - \rho_d W) \implies \log|A| = n\log|I_n - \rho_d W| + n\log|I_n - \rho_o W| \]

(With \(W_d = I_n \otimes W\) and \(W_o = W \otimes I_n\), expanding the Kronecker product pairs \(\rho_d\) with \(I_n \otimes W\), so \(\rho_o\) sits in the left factor; the log-determinant is the same for either factor order.)

This enables exact, O(n) log-det evaluation via eigenvalues of the small \(n\times n\) matrix \(W\).
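A quick numerical check of this factorisation, as a self-contained sketch: build the full \(N \times N\) filter under the constraint \(\rho_w = -\rho_d\rho_o\) and compare its dense log-determinant against the factorised form (`slogdet` stands in here for the eigenvalue route):

```python
import numpy as np

rng = np.random.default_rng(1)
n = 6
W = rng.random((n, n))
np.fill_diagonal(W, 0.0)
W = W / W.sum(axis=1, keepdims=True)
I = np.eye(n)

rho_d, rho_o = 0.4, 0.3
# Filter with the separable constraint rho_w = -rho_d * rho_o
A = (np.eye(n * n)
     - rho_d * np.kron(I, W)               # rho_d * W_d
     - rho_o * np.kron(W, I)               # rho_o * W_o
     + rho_d * rho_o * np.kron(W, W))      # -rho_w * W_w

logdet_full = np.linalg.slogdet(A)[1]      # dense N×N computation
logdet_fact = n * (np.linalg.slogdet(I - rho_d * W)[1]
                   + np.linalg.slogdet(I - rho_o * W)[1])
print(np.allclose(logdet_full, logdet_fact))   # True
```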

Setup

import warnings

import arviz as az
import matplotlib.pyplot as plt
import numpy as np
import pandas as pd

from bayespecon import SARFlow, SARFlowSeparable, flow_design_matrix
from bayespecon.dgp.flows import generate_flow_data, generate_flow_data_separable

warnings.filterwarnings("ignore")
az.style.use("arviz-white")
rng = np.random.default_rng(42)

1 Simulated Data

We simulate flow data on \(n = 12\) spatial units, giving \(N = 144\) O-D pairs. When neither G nor gdf is supplied, generate_flow_data synthesises a point grid, builds a row-standardised KNN graph (knn_k=4 by default) and computes the pairwise distance matrix used for the gravity-style log_distance regressor.

True parameters

| Parameter | Value |
| --- | --- |
| ρ_d | 0.35 |
| ρ_o | 0.25 |
| ρ_w | 0.10 |
| β_d | [1.0, −0.5] |
| β_o | [0.5, 0.3] |
| σ | 1.0 |
| γ_dist (log_distance coef.) | −0.5 (default) |

# True parameters (matching the table above)
RHO_D = 0.35
RHO_O = 0.25
RHO_W = 0.10
BETA_D = [1.0, -0.5]
BETA_O = [0.5, 0.3]
SIGMA = 1.0

# Generate synthetic flow data (auto-builds KNN graph + distance matrix)
data = generate_flow_data(
    n=12,
    rho_d=RHO_D,
    rho_o=RHO_O,
    rho_w=RHO_W,
    beta_d=BETA_D,
    beta_o=BETA_O,
    sigma=SIGMA,
    seed=42,
)

n = 12
N = n * n
G = data["G"]
gdf = data["gdf"]

print(f"Generated {N} O-D flows on {n} spatial units")
print(f"Design matrix: {data['X'].shape}, columns: {data['col_names']}")
print(f"Design: k_d={data['design'].k_d}, k_o={data['design'].k_o}")
Generated 144 O-D flows on 12 spatial units
Design matrix: (144, 9), columns: ['intercept', 'intra_indicator', 'dest_x0', 'dest_x1', 'orig_x0', 'orig_x1', 'intra_x0', 'intra_x1', 'log_distance']
Design: k_d=2, k_o=2
y_obs = data["y_vec"]              # observed (positive) flows
y_vec = np.log(y_obs)              # latent SAR scale (DGP default is lognormal)
X = data["X"]              # (N, p) design matrix (incl. log_distance column)
col_names = data["col_names"]
G = data["G"]              # row-standardised KNN graph synthesised by the DGP
gdf = data["gdf"]          # synthetic point GeoDataFrame

print(f"Flow observations: N = {N}  ({n}×{n})")
print(f"Design matrix shape: {X.shape}")
print(f"Column names: {col_names}")
print(f"Last column ({col_names[-1]}) coefficient (gamma_dist) = {data['gamma_dist']}")
print(
    f"\ny (observed) summary:  min={y_obs.min():.2f}  mean={y_obs.mean():.2f}  max={y_obs.max():.2f}"
)
print(
    f"y (log scale) summary:  min={y_vec.min():.2f}  mean={y_vec.mean():.2f}  max={y_vec.max():.2f}"
)
Flow observations: N = 144  (12×12)
Design matrix shape: (144, 9)
Column names: ['intercept', 'intra_indicator', 'dest_x0', 'dest_x1', 'orig_x0', 'orig_x1', 'intra_x0', 'intra_x1', 'log_distance']
Last column (log_distance) coefficient (gamma_dist) = -0.5

y (observed) summary:  min=0.02  mean=1.65  max=24.26
y (log scale) summary:  min=-4.08  mean=-0.54  max=3.19

Note — lognormal default (since v0.2). generate_flow_data now returns strictly-positive flows by default, drawn from \(y = \exp(\eta)\) where \(\eta = A^{-1}(X\beta + \varepsilon)\) is the latent SAR-filtered linear predictor (also exposed as data["eta_vec"]). The Gaussian-likelihood flow models in this notebook (OLSFlow, SARFlow, SARFlowSeparable) operate on the latent scale, so we fit on np.log(data["y_vec"]). Pass distribution="normal" to recover the legacy Gaussian-on-y behaviour.

# Visualise the flow matrix (observed scale) and the latent log-scale histogram
fig, axes = plt.subplots(1, 2, figsize=(10, 4))

im = axes[0].imshow(y_obs.reshape(n, n), cmap="YlOrRd")
axes[0].set_title("Simulated flow matrix $Y$ (origin × destination)")
axes[0].set_xlabel("Destination")
axes[0].set_ylabel("Origin")
plt.colorbar(im, ax=axes[0])

axes[1].hist(y_vec, bins=30, edgecolor="white")
axes[1].set_title(r"Distribution of $\log y_{ij}$ (latent SAR scale)")
axes[1].set_xlabel("log flow")
axes[1].set_ylabel("Count")

# plt.tight_layout()
plt.show()

2 SARFlow: Three-parameter model

SARFlow places a Dirichlet prior on \((\rho_d, \rho_o, \rho_w)\) that enforces positivity and the stability constraint \(\rho_d + \rho_o + \rho_w \leq 1\) exactly, using a stick-breaking transformation for NUTS efficiency.
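The constraint can be seen directly by sampling the prior. A minimal sketch, assuming a flat Dirichlet over the three ρ's plus a slack cell (the concentration values here are illustrative, not necessarily the package defaults):

```python
import numpy as np

rng = np.random.default_rng(0)
# Four-cell Dirichlet: (rho_d, rho_o, rho_w, slack). Dropping the slack
# cell leaves rho_d + rho_o + rho_w <= 1 with all components >= 0.
rhos = rng.dirichlet(np.ones(4), size=10_000)[:, :3]

print((rhos >= 0).all())                  # True
print((rhos.sum(axis=1) <= 1.0).all())    # True by construction
```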

The log-determinant \(\log|A|\) is evaluated via the Barry–Pace trace-stochastic method, which requires only sparse matrix–vector products.
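For intuition, here is a single-matrix sketch of the trace idea: Hutchinson probes estimate \(\mathrm{tr}(W^m)\), and \(\log|I - \rho W| = -\sum_{m \geq 1} \rho^m \mathrm{tr}(W^m)/m\), truncated at order `miter`. (bayespecon applies the multi-matrix generalisation over \(W_d, W_o, W_w\); this is just the core mechanism.)

```python
import numpy as np

rng = np.random.default_rng(0)
n = 30
W = rng.random((n, n))
np.fill_diagonal(W, 0.0)
W = W / W.sum(axis=1, keepdims=True)   # row-standardise

miter, riter = 30, 200                 # polynomial order, number of probes
trace_est = np.zeros(miter)
for _ in range(riter):
    z = rng.standard_normal(n)
    v = z
    for k in range(miter):
        v = W @ v                      # v = W^{k+1} z — a sparse matvec in practice
        trace_est[k] += z @ v          # E[z' W^m z] = tr(W^m)
trace_est /= riter

rho = 0.4
# log|I - rho W| = -sum_{m>=1} rho^m tr(W^m)/m, truncated at m = miter
approx = -sum(rho ** (m + 1) * trace_est[m] / (m + 1) for m in range(miter))
exact = np.linalg.slogdet(np.eye(n) - rho * W)[1]
print(f"stochastic: {approx:.4f}   exact: {exact:.4f}")
```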

sar_flow = SARFlow(
    y_vec,
    G,
    X,
    col_names=col_names,
    logdet_method="traces",  # Barry-Pace stochastic trace (default)
    restrict_positive=True,  # Dirichlet stability prior
    miter=20,  # trace polynomial order (increase for better accuracy)
    trace_seed=0,
)

idata_sar = sar_flow.fit(
    draws=1000,
    tune=1000,
    chains=4,
    target_accept=0.9,
    random_seed=42,
    progressbar=True,
)
Initializing NUTS using jitter+adapt_diag...
Multiprocess sampling (4 chains in 2 jobs)
NUTS: [rho_simplex, beta, sigma]
Sampling 4 chains for 1_000 tune and 1_000 draw iterations (4_000 + 4_000 draws total) took 51 seconds.

# Posterior summary for the spatial autoregressive parameters
summary_rho = sar_flow.summary(var_names=["rho_d", "rho_o", "rho_w"])
print("=== SARFlow: spatial autoregressive parameters ===")
print(summary_rho.to_string())

# Compare to true values
print(f"\nTrue values:  rho_d={RHO_D}  rho_o={RHO_O}  rho_w={RHO_W}")
=== SARFlow: spatial autoregressive parameters ===
        mean     sd  hdi_3%  hdi_97%  mcse_mean  mcse_sd  ess_bulk  ess_tail  r_hat
rho_d  0.129  0.086   0.000    0.277      0.002    0.001    2403.0    1642.0    1.0
rho_o  0.121  0.082   0.000    0.265      0.002    0.001    2364.0    1668.0    1.0
rho_w  0.304  0.147   0.007    0.541      0.002    0.002    3308.0    2054.0    1.0

True values:  rho_d=0.35  rho_o=0.25  rho_w=0.1
# Full coefficient summary
summary_beta = sar_flow.summary(var_names=["beta", "sigma"])
print("=== SARFlow: regression coefficients ===")
print(summary_beta.to_string())
=== SARFlow: regression coefficients ===
                        mean     sd  hdi_3%  hdi_97%  mcse_mean  mcse_sd  ess_bulk  ess_tail  r_hat
beta[intercept]       -0.161  0.332  -0.790    0.440      0.006    0.005    2857.0    2845.0    1.0
beta[intra_indicator]  0.097  0.465  -0.677    1.062      0.009    0.007    2790.0    2286.0    1.0
beta[dest_x0]          1.126  0.157   0.831    1.418      0.003    0.003    2815.0    2620.0    1.0
beta[dest_x1]         -0.354  0.131  -0.595   -0.105      0.002    0.002    3254.0    3060.0    1.0
beta[orig_x0]          0.811  0.143   0.543    1.083      0.003    0.002    2944.0    2514.0    1.0
beta[orig_x1]          0.368  0.131   0.105    0.595      0.002    0.002    3371.0    2800.0    1.0
beta[intra_x0]        -0.397  0.437  -1.240    0.392      0.008    0.007    3205.0    2829.0    1.0
beta[intra_x1]        -0.184  0.434  -1.015    0.588      0.007    0.006    3486.0    2989.0    1.0
beta[log_distance]    -0.540  0.350  -1.157    0.157      0.007    0.005    2484.0    2471.0    1.0
sigma                  0.929  0.058   0.832    1.047      0.001    0.001    4356.0    2704.0    1.0
# Trace plots for the three ρ parameters
az.plot_trace(
    idata_sar,
    var_names=["rho_d", "rho_o", "rho_w"],
    figsize=(10, 6),
)
plt.suptitle("SARFlow — posterior traces", y=1.01)
plt.tight_layout()
plt.show()
# Posterior densities with true-value markers
fig, axes = plt.subplots(1, 3, figsize=(12, 3))
params = [("rho_d", RHO_D), ("rho_o", RHO_O), ("rho_w", RHO_W)]

for ax, (param, true_val) in zip(axes, params):
    az.plot_posterior(
        idata_sar,
        var_names=[param],
        ax=ax,
        ref_val=true_val,
        hdi_prob=0.94,
    )
    # ax.set_title(f"${param.replace('_', r'\\_')}$ (true = {true_val})")

plt.suptitle("SARFlow — posterior distributions (red line = true value)", y=1.02)
plt.tight_layout()
plt.show()

3 SARFlowSeparable: Constrained model

SARFlowSeparable imposes the constraint \(\rho_w = -\rho_d \rho_o\), yielding:

  • Exact log-det via eigenvalues of the small \(n \times n\) matrix \(W\) — no Monte Carlo traces needed.

  • Two free parameters (\(\rho_d\), \(\rho_o\)) instead of three.

  • Typically faster mixing because the posterior geometry is simpler.

We generate new data that satisfies the constraint (\(\rho_w = -\rho_d\rho_o = -0.12\)) to give the model the best chance at recovery.

# True parameters for the separable model (asymmetric for identifiability)
RHO_D_SEP = 0.40
RHO_O_SEP = 0.30
RHO_W_SEP = -RHO_D_SEP * RHO_O_SEP  # = -0.12 (constraint)

print(
    f"Separable true values:  rho_d={RHO_D_SEP}  rho_o={RHO_O_SEP}"
    f"  rho_w=-rho_d*rho_o={RHO_W_SEP:.4f}"
)

data_sep = generate_flow_data_separable(
    n,
    G,
    rho_d=RHO_D_SEP,
    rho_o=RHO_O_SEP,
    beta_d=BETA_D,
    beta_o=BETA_O,
    sigma=SIGMA,
    seed=7,
)

y_sep = np.log(data_sep["y_vec"])  # latent SAR scale (lognormal default)
X_sep = data_sep["X"]
cn_sep = data_sep["col_names"]

print(
    f"\ny_sep summary:  min={y_sep.min():.2f}  mean={y_sep.mean():.2f}  max={y_sep.max():.2f}"
)
Separable true values:  rho_d=0.4  rho_o=0.3  rho_w=-rho_d*rho_o=-0.1200

y_sep summary:  min=-8.22  mean=-3.04  max=1.45
sep_flow = SARFlowSeparable(
    y_sep,
    G,
    X_sep,
    col_names=cn_sep,
    # logdet_method="eigenvalue" is the default for SARFlowSeparable
    trace_seed=0,
)

idata_sep = sep_flow.fit(
    draws=1000,
    tune=1000,
    chains=4,
    target_accept=0.9,
    random_seed=42,
    progressbar=True,
)
Initializing NUTS using jitter+adapt_diag...
Multiprocess sampling (4 chains in 2 jobs)
NUTS: [rho_d, rho_o, beta, sigma]
Sampling 4 chains for 1_000 tune and 1_000 draw iterations (4_000 + 4_000 draws total) took 19 seconds.

summary_sep = sep_flow.summary(var_names=["rho_d", "rho_o"])
print("=== SARFlowSeparable: spatial autoregressive parameters ===")
print(summary_sep.to_string())
print(
    f"\nTrue values:  rho_d={RHO_D_SEP}  rho_o={RHO_O_SEP}"
    f"  (rho_w = -rho_d*rho_o = {RHO_W_SEP:.4f} is derived)"
)
=== SARFlowSeparable: spatial autoregressive parameters ===
        mean     sd  hdi_3%  hdi_97%  mcse_mean  mcse_sd  ess_bulk  ess_tail  r_hat
rho_d  0.174  0.106  -0.035    0.368      0.002    0.002    2965.0    2628.0    1.0
rho_o  0.314  0.098   0.143    0.509      0.002    0.002    2430.0    2454.0    1.0

True values:  rho_d=0.4  rho_o=0.3  (rho_w = -rho_d*rho_o = -0.1200 is derived)
az.plot_trace(
    idata_sep,
    var_names=["rho_d", "rho_o"],
    figsize=(10, 4),
)
plt.suptitle("SARFlowSeparable — posterior traces", y=1.01)
plt.tight_layout()
plt.show()
fig, axes = plt.subplots(1, 2, figsize=(8, 3))
for ax, (param, true_val) in zip(axes, [("rho_d", RHO_D_SEP), ("rho_o", RHO_O_SEP)]):
    az.plot_posterior(
        idata_sep, var_names=[param], ax=ax, ref_val=true_val, hdi_prob=0.94
    )
    label = param.replace("_", r"\_")  # escape underscore for mathtext
    ax.set_title(f"${label}$ (true = {true_val})")

plt.suptitle("SARFlowSeparable — posterior distributions", y=1.02)
plt.tight_layout()
plt.show()

4 Design Matrix Details

The flow_design_matrix helper builds the standard LeSage O-D design matrix with destination, origin, and intra-zonal coefficient blocks. Each column of the regional attribute matrix \(X\) (shape \(n \times k\)) produces three columns in the full design matrix: one for destination effects, one for origin effects, and one for intra-zonal effects.

| Column group | Construction | Interpretation |
| --- | --- | --- |
| intercept | \(\mathbf{1}_N\) | global mean flow |
| intra_indicator | \(\text{vec}(I_n)\) | intra-zonal dummy |
| dest_* | \(\iota_n \otimes X\) | destination characteristics |
| orig_* | \(X \otimes \iota_n\) | origin characteristics |
| intra_* | \(\text{diag}(I_n) \cdot (I_n \otimes X)\) | intra-zonal attributes |
| log_distance | \(\log(1 + d_{ij})\) | gravity distance decay |

By default the DGP appends a log_distance column built from \(\log(1 + d_{ij})\) and assigns it the coefficient gamma_dist=-0.5 (set gamma_dist=0.0 to neutralise the effect). When you build a design matrix by hand via flow_design_matrix, pass dist=... together with log_distance=True to include the same column.

You can also pass a pre-built pd.DataFrame directly as X — column names are inferred automatically.
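The dest_* and orig_* constructions in the table can be reproduced directly with np.kron (a sketch; \(\iota_n\) is the length-\(n\) vector of ones, and the row ordering shown matches the destination-fastest convention implied by \(W_d = I_n \otimes W\)):

```python
import numpy as np

rng = np.random.default_rng(7)
n, k = 4, 2
X = rng.standard_normal((n, k))   # regional attribute matrix
iota = np.ones((n, 1))

dest_block = np.kron(iota, X)     # iota_n ⊗ X : attribute of the destination
orig_block = np.kron(X, iota)     # X ⊗ iota_n : attribute of the origin

print(dest_block.shape, orig_block.shape)   # (16, 2) (16, 2)
# In the first n rows (origin 0), destinations cycle through all regions
# while the origin attribute stays fixed at region 0's value:
print(np.allclose(dest_block[:n], X))                     # True
print(np.allclose(orig_block[:n], np.tile(X[0], (n, 1))))  # True
```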

# Build a design matrix from scratch using regional attributes
X_regional = rng.standard_normal((n, 2))  # n × k attribute matrix
dm = flow_design_matrix(X_regional, col_names=["income", "pop"])

print(f"Regional attribute matrix: {X_regional.shape}  (n × k)")
print(f"Full O-D design matrix:    {dm.combined.shape}  (N × p)")
print(f"\nColumn names:\n  {dm.feature_names}")

pd.DataFrame(dm.combined[:6], columns=dm.feature_names).round(3)
Regional attribute matrix: (12, 2)  (n × k)
Full O-D design matrix:    (144, 8)  (N × p)

Column names:
  ['intercept', 'intra_indicator', 'dest_income', 'dest_pop', 'orig_income', 'orig_pop', 'intra_income', 'intra_pop']
intercept intra_indicator dest_income dest_pop orig_income orig_pop intra_income intra_pop
0 1.0 1.0 0.305 -1.040 0.305 -1.04 0.305 -1.04
1 1.0 0.0 0.750 0.941 0.305 -1.04 0.000 0.00
2 1.0 0.0 -1.951 -1.302 0.305 -1.04 -0.000 -0.00
3 1.0 0.0 0.128 -0.316 0.305 -1.04 0.000 -0.00
4 1.0 0.0 -0.017 -0.853 0.305 -1.04 -0.000 -0.00
5 1.0 0.0 0.879 0.778 0.305 -1.04 0.000 0.00

5 Spatial Effects

The flow SAR model has a rich effects decomposition, following Thomas-Agnan & LeSage (2014, ch. 83, §83.5). For each regional predictor \(p\), a unit shock to \(X_d^{(p)}\) (destination side) and a unit shock to \(X_o^{(p)}\) (origin side) propagate through the spatial filter \(A = I_N - \rho_d W_d - \rho_o W_o - \rho_w W_w\) to produce five scalar summaries (averaged across the \(n\) perturbed regions):

| Effect | Symbol | Cells aggregated |
| --- | --- | --- |
| Origin | OE | flows whose origin matches the perturbed region |
| Destination | DE | flows whose destination matches the perturbed region |
| Intra | IE | the diagonal \((j, j)\) flow of the perturbed region |
| Network | NE | all remaining flows |
| Total | TE | OE + DE + IE + NE |

The high-level wrapper model.spatial_effects() returns a tidy DataFrame. Three modes are supported:

  • mode="auto" (default) — combined (sum of dest+orig) when \(X_o = X_d\), otherwise separate.

  • mode="combined" — always sum dest and orig contributions.

  • mode="separate" — always report both sides (Thomas-Agnan §83.5.2).

When intra_* columns are present the destination shock also carries \(\beta_\text{intra}\) at the diagonal cell, since the design matrix sets X_intra = intra_indicator · X_dest.

# Combined effects (default for symmetric Xo == Xd designs).
effects_df = sar_flow.spatial_effects(mode="combined")
effects_df.round(4)
mean ci_lower ci_upper bayes_pvalue
predictor side effect
x0 combined origin 1.0503 0.6952 1.5691 0.0000
x1 combined origin 0.3848 0.0985 0.6795 0.0120
x0 combined destination 1.3697 1.0426 1.8541 0.0000
x1 combined destination -0.3645 -0.6499 -0.0598 0.0245
x0 combined intra 0.1684 0.0958 0.2441 0.0000
x1 combined intra -0.0137 -0.0794 0.0549 0.6860
x0 combined network 2.2972 0.4723 6.7456 0.0000
x1 combined network 0.0436 -0.4727 0.7376 0.9995
x0 combined total 4.8856 2.5974 10.2205 0.0000
x1 combined total 0.0503 -0.8938 1.2163 0.9835
# Small sampler settings used for the demo fits below.
SAMPLE_KWARGS_DEMO = dict(draws=200, tune=200, chains=2,
                          random_seed=0, progressbar=False)

5.1 Asymmetric origin / destination attributes

When destination and origin attribute matrices have different numbers of columns (k_d ≠ k_o), pass beta_d and beta_o of different lengths to generate_flow_data. The design matrix layout becomes:

intercept | intra_indicator | dest_*(k_d) | orig_*(k_o) | intra_*(k_d) | dist

The model auto-detects k_d and k_o from the column-name prefixes (dest_* vs orig_*). spatial_effects() reports destination and origin effects separately when k_d ≠ k_o, since the predictors are different variables.

from bayespecon.dgp.flows import generate_flow_data

# Asymmetric: 2 destination attributes, 1 origin attribute
data_asym = generate_flow_data(
    n=5,
    rho_d=0.2, rho_o=0.15, rho_w=0.05,
    beta_d=[1.0, -0.5],   # k_d = 2
    beta_o=[0.5],          # k_o = 1
    sigma=0.5, seed=11,
)
G_asym = data_asym["G"]
print(f"k_d={data_asym['design'].k_d}, k_o={data_asym['design'].k_o}")
print(f"Columns: {data_asym['col_names']}")

model_asym = SARFlow(
    np.log(data_asym["y_vec"]), G_asym, data_asym["X"],
    col_names=data_asym["col_names"],
)
print("symmetric_xo_xd =", model_asym._symmetric_xo_xd)
model_asym.fit(**SAMPLE_KWARGS_DEMO)
model_asym.spatial_effects().round(4)  # auto -> separate
k_d=2, k_o=1
Columns: ['intercept', 'intra_indicator', 'dest_x0', 'dest_x1', 'orig_y0', 'intra_x0', 'intra_x1', 'log_distance']
symmetric_xo_xd = False
Initializing NUTS using jitter+adapt_diag...
Multiprocess sampling (2 chains in 2 jobs)
NUTS: [rho_simplex, beta, sigma]
Sampling 2 chains for 200 tune and 200 draw iterations (400 + 400 draws total) took 18 seconds.
We recommend running at least 4 chains for robust computation of convergence diagnostics
The rhat statistic is larger than 1.01 for some parameters. This indicates problems during sampling. See https://arxiv.org/abs/1903.08008 for details
The effective sample size per chain is smaller than 100 for some parameters.  A higher number is needed for reliable rhat and ess computation. See https://arxiv.org/abs/1903.08008 for details
mean ci_lower ci_upper bayes_pvalue
predictor side effect
x0 dest origin 0.4821 0.0205 2.1100 0.000
x1 dest origin -0.2589 -1.1976 -0.0098 0.000
x0 dest destination 1.2029 0.5060 2.7569 0.000
x1 dest destination -0.7665 -1.7403 -0.4156 0.000
x0 dest intra 0.3384 0.1113 0.7707 0.005
x1 dest intra -0.1362 -0.3784 -0.0168 0.020
x0 dest network 1.8624 0.0868 8.3674 0.000
x1 dest network -1.1366 -4.9222 -0.0589 0.000
x0 dest total 3.8859 0.9522 14.1595 0.000
x1 dest total -2.2982 -8.2249 -0.6393 0.000
y0 orig origin 0.7337 0.2069 1.4377 0.010
destination 0.1827 0.0069 0.7493 0.010
intra 0.1834 0.0517 0.3594 0.010
network 0.7309 0.0277 2.9970 0.010
total 1.8308 0.3347 5.3013 0.010
# You can also build asymmetric designs directly with flow_design_matrix_asymmetric
from bayespecon.graph import flow_design_matrix_asymmetric

rng_asym = np.random.default_rng(7)
n_demo = 5
Xd = rng_asym.standard_normal((n_demo, 2))  # 2 destination variables
Xo = rng_asym.standard_normal((n_demo, 1))  # 1 origin variable

dm = flow_design_matrix_asymmetric(Xd, Xo)
print(f"Design shape: {dm.combined.shape}")
print(f"k_d={dm.k_d}, k_o={dm.k_o}")
print(f"Feature names: {dm.feature_names}")
Design shape: (25, 7)
k_d=2, k_o=1
Feature names: ['intercept', 'intra_indicator', 'dest_x0', 'dest_x1', 'orig_y0', 'intra_x0', 'intra_x1']

5.2 Non-spatial OLS gravity baseline (OLSFlow)

OLSFlow is the conventional log-linear gravity model (Thomas-Agnan & LeSage 2014, eq. 83.2): no spatial-lag terms, just \(y = X \beta + \varepsilon\) with iid Gaussian errors. It uses the same spatial_effects() API and reproduces the closed-form expressions in Table 83.1 (with \(A = I_N\)):

\[ \mathrm{TE} = \beta_d + \beta_o, \quad \mathrm{OE} = \tfrac{n-1}{n}\beta_o, \quad \mathrm{DE} = \tfrac{n-1}{n}\beta_d, \quad \mathrm{IE} = (\beta_d + \beta_o + \beta_\text{intra}) / n, \quad \mathrm{NE} = 0. \]
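These closed forms are simple arithmetic on posterior draws. A sketch with hypothetical point values for the coefficients (not taken from the fits above):

```python
n = 12
beta_d, beta_o, beta_intra = 1.0, 0.5, -0.4   # hypothetical coefficient values

TE = beta_d + beta_o                       # total effect
OE = (n - 1) / n * beta_o                  # origin effect
DE = (n - 1) / n * beta_d                  # destination effect
IE = (beta_d + beta_o + beta_intra) / n    # intra-zonal effect
NE = 0.0                                   # no network effect when A = I_N

print(f"TE={TE:.3f}  OE={OE:.3f}  DE={DE:.3f}  IE={IE:.3f}  NE={NE:.1f}")
```

In the Bayesian version these lines are evaluated per posterior draw of \(\beta\), giving a full posterior for each effect.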
from bayespecon import OLSFlow

ols_flow = OLSFlow(y_vec, G, X, col_names=col_names)
ols_flow.fit(**SAMPLE_KWARGS_DEMO)
ols_flow.spatial_effects(mode="combined").round(4)
Initializing NUTS using jitter+adapt_diag...
Multiprocess sampling (2 chains in 2 jobs)
NUTS: [beta, sigma]
Sampling 2 chains for 200 tune and 200 draw iterations (400 + 400 draws total) took 5 seconds.
We recommend running at least 4 chains for robust computation of convergence diagnostics
The rhat statistic is larger than 1.01 for some parameters. This indicates problems during sampling. See https://arxiv.org/abs/1903.08008 for details
The effective sample size per chain is smaller than 100 for some parameters.  A higher number is needed for reliable rhat and ess computation. See https://arxiv.org/abs/1903.08008 for details
mean ci_lower ci_upper bayes_pvalue
predictor side effect
x0 combined origin 0.8135 0.5986 1.0445 0.000
x1 combined origin 0.2887 0.0519 0.5308 0.010
x0 combined destination 1.2213 1.0018 1.4380 0.000
x1 combined destination -0.4622 -0.7067 -0.2168 0.000
x0 combined intra 0.1431 0.0657 0.2103 0.000
x1 combined intra -0.0229 -0.0840 0.0452 0.495
x0 combined network 0.0000 0.0000 0.0000 0.000
x1 combined network 0.0000 0.0000 0.0000 0.000
x0 combined total 2.1779 1.8483 2.4992 0.000
x1 combined total -0.1964 -0.5958 0.1888 0.310

6 MCMC Diagnostics

Good practice after any Bayesian fit: check \(\hat{R}\) (should be \(< 1.01\)) and the bulk effective sample size (ESS > 400 is a common rule of thumb).

for label, idata in [("SARFlow", idata_sar), ("SARFlowSeparable", idata_sep)]:
    diag = az.summary(idata, var_names=["rho_d", "rho_o"], stat_focus="mean")
    rhat_ok = (diag["r_hat"] < 1.01).all()
    ess_ok = (diag["ess_bulk"] > 400).all()
    print(f"{label:25s}  r_hat<1.01: {rhat_ok}   ess_bulk>400: {ess_ok}")
    print(diag[["mean", "sd", "hdi_3%", "hdi_97%", "ess_bulk", "r_hat"]].to_string())
    print()
SARFlow                    r_hat<1.01: True   ess_bulk>400: True
        mean     sd  hdi_3%  hdi_97%  ess_bulk  r_hat
rho_d  0.129  0.086     0.0    0.277    2403.0    1.0
rho_o  0.121  0.082     0.0    0.265    2364.0    1.0

SARFlowSeparable           r_hat<1.01: True   ess_bulk>400: True
        mean     sd  hdi_3%  hdi_97%  ess_bulk  r_hat
rho_d  0.174  0.106  -0.035    0.368    2965.0    1.0
rho_o  0.314  0.098   0.143    0.509    2430.0    1.0

Spatial-parameter adequacy

Flow models have three spatial scalars (\(\rho_d\), \(\rho_o\), \(\rho_w\)), each prone to the slow-mixing behaviour discussed in Wolf, Anselin & Arribas-Bel (2018). The spatial_mcmc_diagnostic helper auto-detects all three and reports ESS, sampler yield, \(\hat{R}\), and HPDI stability.

from bayespecon import spatial_mcmc_diagnostic

spatial_mcmc_diagnostic(sar_flow, emit_warnings=False).to_frame()
ess_bulk ess_tail r_hat mcse_mean yield_pct hpdi_drift_pct adequate
parameter
rho_d 2403.301734 1642.150753 1.001299 0.001630 60.082543 4.102400 True
rho_o 2364.052318 1668.068665 1.001046 0.001575 59.101308 4.132537 True
rho_w 3308.179383 2053.519490 1.000170 0.002483 82.704485 3.997955 True
# Energy plot — checks that HMC is exploring the posterior efficiently
fig, axes = plt.subplots(1, 2, figsize=(10, 3))
az.plot_energy(idata_sar, ax=axes[0])
axes[0].set_title("SARFlow energy")
az.plot_energy(idata_sep, ax=axes[1])
axes[1].set_title("SARFlowSeparable energy")
plt.tight_layout()
plt.show()

7 Model Comparison

The WAIC / LOO-CV scores can be used to compare SARFlow and SARFlowSeparable when both are estimated on the same data. A higher ELPD (expected log pointwise predictive density) indicates better out-of-sample predictive fit.

Note: here we fit SARFlow to the separable data to make the comparison meaningful — the separable model is nested in the unrestricted one.

# Fit unrestricted SARFlow on the same separable data for a fair comparison
sar_flow_on_sep = SARFlow(
    y_sep,
    G,
    X_sep,
    col_names=cn_sep,
    logdet_method="traces",
    miter=20,
    trace_seed=0,
)
idata_sar_on_sep = sar_flow_on_sep.fit(
    draws=1000,
    tune=1000,
    chains=4,
    target_accept=0.9,
    random_seed=42,
    progressbar=True,
)
Initializing NUTS using jitter+adapt_diag...
Multiprocess sampling (4 chains in 2 jobs)
NUTS: [rho_simplex, beta, sigma]
Sampling 4 chains for 1_000 tune and 1_000 draw iterations (4_000 + 4_000 draws total) took 92 seconds.

# LOO-CV comparison (requires log-likelihood in idata; falls back to WAIC if unavailable)
try:
    loo_sar = az.loo(idata_sar_on_sep)
    loo_sep = az.loo(idata_sep)
    comparison = az.compare(
        {"SARFlow (3 ρ)": idata_sar_on_sep, "SARFlowSeparable": idata_sep}
    )
    print("LOO-CV model comparison:")
    print(
        comparison[
            ["elpd_loo", "p_loo", "elpd_diff", "weight", "se", "warning"]
        ].to_string()
    )
except Exception as e:
    print(f"LOO not available (log-likelihood not stored): {e}")
    print("Tip: pass compute_log_likelihood=True to fit() to enable LOO/WAIC.")
LOO-CV model comparison:
                    elpd_loo      p_loo  elpd_diff  weight        se  warning
SARFlowSeparable -195.044604  11.321569   0.000000     1.0  8.482200    False
SARFlow (3 ρ)    -195.511241  11.185733   0.466637     0.0  8.521106    False

8 Prior Sensitivity and Stability Constraint

By default SARFlow uses a Dirichlet prior that forces \(\rho_d, \rho_o, \rho_w \geq 0\) and \(\rho_d + \rho_o + \rho_w \leq 1\). Setting restrict_positive=False allows negative spillovers via independent Uniform(-1, 1) priors plus a differentiable stability wall potential.

Use restrict_positive=False when competitive effects (e.g. negative network parameter) are theoretically expected.

# Fit with restrict_positive=False to allow negative rho values
sar_flow_neg = SARFlow(
    y_vec,
    G,
    X,
    col_names=col_names,
    logdet_method="traces",
    restrict_positive=False,  # Uniform(-1,1) priors + stability potential
    miter=20,
    trace_seed=0,
)
idata_neg = sar_flow_neg.fit(
    draws=800,
    tune=1000,
    chains=4,
    target_accept=0.95,
    random_seed=42,
    progressbar=True,
)

summary_neg = sar_flow_neg.summary()
print("=== SARFlow (restrict_positive=False) ===")
print(summary_neg[["mean", "sd", "hdi_3%", "hdi_97%", "r_hat"]].to_string())
Initializing NUTS using jitter+adapt_diag...
Multiprocess sampling (4 chains in 2 jobs)
NUTS: [rho_d, rho_o, rho_w, beta, sigma]
Sampling 4 chains for 1_000 tune and 800 draw iterations (4_000 + 3_200 draws total) took 48 seconds.
There were 2 divergences after tuning. Increase `target_accept` or reparameterize.

=== SARFlow (restrict_positive=False) ===
                        mean     sd  hdi_3%  hdi_97%  r_hat
rho_d                  0.087  0.123  -0.127    0.333    1.0
rho_o                  0.056  0.126  -0.184    0.283    1.0
rho_w                  0.365  0.182   0.025    0.716    1.0
beta[intercept]       -0.167  0.345  -0.812    0.478    1.0
beta[intra_indicator]  0.057  0.471  -0.836    0.911    1.0
beta[dest_x0]          1.201  0.187   0.866    1.553    1.0
beta[dest_x1]         -0.377  0.134  -0.635   -0.137    1.0
beta[orig_x0]          0.843  0.166   0.522    1.146    1.0
beta[orig_x1]          0.382  0.140   0.114    0.645    1.0
beta[intra_x0]        -0.404  0.447  -1.218    0.466    1.0
beta[intra_x1]        -0.195  0.439  -1.029    0.610    1.0
beta[log_distance]    -0.584  0.363  -1.196    0.140    1.0
sigma                  0.933  0.057   0.825    1.037    1.0

10 Bayesian LM Diagnostics

Once a baseline gravity model has been fit, the Bayesian Lagrange-multiplier (LM) tests in bayespecon.diagnostics indicate which spatial-lag directions (destination, origin, network) are worth adding — and which intra-block columns deserve their own coefficients. They are the OD-flow analogues of the classic Anselin/Koley–Bera LM family, ported to the posterior-predictive score formulation of Doğan, Taşpınar & Bera (2021).

For each posterior draw \(g\) from a fitted null model with residuals \(e_g = y - X\beta_g\), the score for direction \(i \in \{d, o, w\}\) is

\[ s_g^{(i)} = (W_i\, y)^\top e_g, \qquad W_d = I_n \otimes W,\; W_o = W \otimes I_n,\; W_w = W \otimes W, \]

and the information matrix uses the cached Kronecker-trace block \(T_{\text{flow}}\) (computed in \(\mathcal{O}(\mathrm{nnz})\) from flow_trace_blocks(W)):

\[ J = T_{\text{flow}}\,\bar\sigma^{2} + Q, \qquad Q_{ij} = (W_i y)^\top (W_j y). \]

The marginal statistic is \(LM_g = (s_g^{(i)})^2 / J_{ii}\) (\(\chi^2_1\)); the joint statistic is \(s_g^\top J^{-1} s_g\), with \(s_g = (s_g^{(d)}, s_g^{(o)}, s_g^{(w)})^\top\) (\(\chi^2_3\)). The Bayesian \(p\)-value reports \(1 - F_{\chi^2_{\text{df}}}(\overline{LM})\).
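Per draw, the statistics reduce to a few linear-algebra lines. A toy sketch with made-up score and information values (not taken from the fits above):

```python
import numpy as np
from scipy.stats import chi2

s = np.array([1.2, -0.8, 2.1])            # scores for directions (d, o, w)
J = np.array([[4.0, 0.5, 0.3],
              [0.5, 3.5, 0.4],
              [0.3, 0.4, 5.0]])           # information matrix

lm_d = s[0] ** 2 / J[0, 0]                # marginal statistic, chi^2_1
lm_joint = s @ np.linalg.solve(J, s)      # joint statistic, chi^2_3

print(f"LM_d = {lm_d:.3f},  p = {chi2.sf(lm_d, df=1):.3f}")
print(f"LM_joint = {lm_joint:.3f},  p = {chi2.sf(lm_joint, df=3):.3f}")
```

In the Bayesian formulation these statistics are averaged over the posterior draws before the \(\chi^2\) survival function is applied.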

from bayespecon import (
    bayesian_lm_flow_dest_test,
    bayesian_lm_flow_orig_test,
    bayesian_lm_flow_network_test,
    bayesian_lm_flow_joint_test,
    bayesian_lm_flow_intra_test,
)

# All marginal + joint + intra tests, evaluated at the OLSFlow null draws.
flow_lm_tests = {
    "rho_d (dest)":     bayesian_lm_flow_dest_test(ols_flow),
    "rho_o (orig)":     bayesian_lm_flow_orig_test(ols_flow),
    "rho_w (network)":  bayesian_lm_flow_network_test(ols_flow),
    "joint (d,o,w)":    bayesian_lm_flow_joint_test(ols_flow),
    "intra block":      bayesian_lm_flow_intra_test(ols_flow),
}

import pandas as pd
pd.DataFrame(
    [(k, r.df, r.mean, r.bayes_pvalue) for k, r in flow_lm_tests.items()],
    columns=["test", "df", "LM mean", "Bayes p"],
).round(4)
test df LM mean Bayes p
0 rho_d (dest) 1 1.3262 0.2495
1 rho_o (orig) 1 1.1684 0.2797
2 rho_w (network) 1 2.9788 0.0844
3 joint (d,o,w) 3 3.9794 0.2637
4 intra block 3 2.6648 0.4462

10.1 Robust (Neyman-orthogonal) tests

The marginal LM tests assume the other two ρ parameters are zero under the null. When that assumption is in doubt, evaluate the score and information at the alternative SARFlow posterior and apply the Neyman-orthogonal adjustment

\[ s^{*}_i = s_i - J_{i,\nu} J_{\nu,\nu}^{-1} s_\nu, \qquad V^{*} = J_{ii} - J_{i,\nu} J_{\nu,\nu}^{-1} J_{\nu,i}, \]

where \(\nu\) indexes the two nuisance directions. The result is robust to local misspecification of the nuisance ρ’s (Doğan et al. 2021, Proposition 3).
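The adjustment is a Schur-complement partialling-out of the nuisance block. A numpy sketch with made-up score and information values (i indexes the tested direction, ν the nuisances):

```python
import numpy as np

s = np.array([1.2, -0.8, 2.1])                       # scores for (d, o, w)
J = np.array([[4.0, 0.5, 0.3],
              [0.5, 3.5, 0.4],
              [0.3, 0.4, 5.0]])                      # information matrix

i, nu = 0, [1, 2]                                    # test rho_d; nuisances rho_o, rho_w
J_nn_inv = np.linalg.inv(J[np.ix_(nu, nu)])
s_star = s[i] - J[i, nu] @ J_nn_inv @ s[nu]          # orthogonalised score
V_star = J[i, i] - J[i, nu] @ J_nn_inv @ J[nu, i]    # Schur complement
lm_robust = s_star ** 2 / V_star                     # chi^2_1 under H0

print(f"s* = {s_star:.4f}, V* = {V_star:.4f}, LM* = {lm_robust:.4f}")
```

Note V* is always below J_ii for a positive-definite J, reflecting the efficiency cost of not knowing the nuisance ρ's.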

from bayespecon import (
    bayesian_robust_lm_flow_dest_test,
    bayesian_robust_lm_flow_orig_test,
    bayesian_robust_lm_flow_network_test,
)

robust_tests = {
    "rho_d | (rho_o, rho_w)": bayesian_robust_lm_flow_dest_test(sar_flow),
    "rho_o | (rho_d, rho_w)": bayesian_robust_lm_flow_orig_test(sar_flow),
    "rho_w | (rho_d, rho_o)": bayesian_robust_lm_flow_network_test(sar_flow),
}

pd.DataFrame(
    [(k, r.mean, r.bayes_pvalue) for k, r in robust_tests.items()],
    columns=["test", "LM* mean", "Bayes p"],
).round(4)
test LM* mean Bayes p
0 rho_d | (rho_o, rho_w) 0.5422 0.4615
1 rho_o | (rho_d, rho_w) 0.4449 0.5047
2 rho_w | (rho_d, rho_o) 0.5610 0.4538

10.2 Panel analogues

For a fitted OLSFlowPanel, the same tests are available with the bayesian_panel_lm_flow_* prefix. Scores accumulate over the demeaned panel stack of length \(n^2 \cdot T\) and the information matrix scales the trace block by \(T\) to reflect i.i.d. within-period contributions under \(H_0\):

from bayespecon import (
    bayesian_panel_lm_flow_dest_test,
    bayesian_panel_lm_flow_orig_test,
    bayesian_panel_lm_flow_network_test,
    bayesian_panel_lm_flow_joint_test,
    bayesian_panel_lm_flow_intra_test,
)

11 Quick Reference

from bayespecon import OLSFlow, SARFlow, SARFlowSeparable
from bayespecon import flow_design_matrix
from bayespecon.dgp.flows import generate_flow_data, generate_flow_data_separable

# Build design matrix from regional attributes
dm = flow_design_matrix(X_regional, col_names=["income", "pop"])

# --- SARFlow (three free ρ parameters) ---
model = SARFlow(
    y, G, dm.combined,
    col_names=dm.feature_names,
    logdet_method="traces",      # Barry-Pace stochastic traces (default)
    restrict_positive=True,      # Dirichlet stability prior
)
idata = model.fit(draws=2000, tune=1000, chains=4, random_seed=0)
model.summary(var_names=["rho_d", "rho_o", "rho_w"])

# --- SARFlowSeparable (ρ_w = -ρ_d·ρ_o, exact log-det) ---
sep_model = SARFlowSeparable(
    y, G, dm.combined,
    col_names=dm.feature_names,
    # logdet_method="eigenvalue" is the default
)
idata_sep = sep_model.fit(draws=2000, tune=1000, chains=4, random_seed=0)

# --- OLSFlow (non-spatial gravity baseline) ---
ols = OLSFlow(y, G, dm.combined, col_names=dm.feature_names)
idata_ols = ols.fit(draws=2000, tune=1000, chains=4, random_seed=0)
ols.spatial_effects(mode="combined")  # closed-form Table 83.1

Key parameters

| Parameter | Default | Description |
| --- | --- | --- |
| logdet_method | "traces" | "traces" (Barry–Pace) or "eigenvalue"/"chebyshev" for separable |
| restrict_positive | True | Dirichlet prior; set False for negative-spillover models |
| miter | 30 | Trace polynomial order (higher = more accurate, slower precomputation) |
| trace_riter | 50 | Monte Carlo probes for trace estimation |

Model choice guide

| Data characteristic | Recommended model |
| --- | --- |
| No prior on sign of spatial effects | SARFlow(restrict_positive=False) |
| Positive spillovers expected | SARFlow(restrict_positive=True) |
| Separability plausible, fast inference needed | SARFlowSeparable |
| Count/non-negative flows | PoissonSARFlow or PoissonSARFlowSeparable |
PoissonSARFlow or PoissonSARFlowSeparable