bayespecon.models.flow.SARNegBinFlowSeparableLatent

class bayespecon.models.flow.SARNegBinFlowSeparableLatent(y, G, X, **kwargs)[source]

Bayesian separable SAR-NB flow model with Pólya–Gamma Gibbs sampler.

Mirrors SARFlowSeparable (Gaussian), but replaces the Gaussian likelihood with a Negative Binomial observation model and uses Pólya–Gamma data augmentation, which yields fully conjugate Gibbs updates for \(\eta\), \(\beta\), and \(\sigma^2\).
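For orientation, the standard Pólya–Gamma identity (Polson, Scott and Windle, 2013) that makes a Negative Binomial likelihood conditionally Gaussian is sketched below. The parameterisation \(\psi_i = \eta_i - \log\alpha\), with \(\alpha\) the NB dispersion and \(\exp(\eta_i)\) the mean, is an assumption made for illustration; the class may parameterise the model differently.

```latex
% NB likelihood written in logistic form (assumed parameterisation)
\[
p(y_i \mid \psi_i, \alpha) \;\propto\;
\frac{(e^{\psi_i})^{y_i}}{(1 + e^{\psi_i})^{\,y_i + \alpha}},
\qquad \psi_i = \eta_i - \log\alpha .
\]
% Polya-Gamma integral identity, with kappa = a - b/2 and omega ~ PG(b, 0)
\[
\frac{(e^{\psi})^{a}}{(1 + e^{\psi})^{b}}
= 2^{-b}\, e^{\kappa \psi} \int_0^\infty e^{-\omega \psi^2 / 2}\, p(\omega)\, d\omega ,
\qquad \kappa = a - \tfrac{b}{2} .
\]
% Resulting conditionals: Gaussian-shaped in psi_i given omega_i
\[
\omega_i \mid \psi_i \sim \mathrm{PG}(y_i + \alpha,\ \psi_i),
\qquad
p(y_i \mid \psi_i, \omega_i) \;\propto\;
\exp\!\left(\kappa_i \psi_i - \tfrac{1}{2}\,\omega_i \psi_i^2\right),
\quad \kappa_i = \tfrac{y_i - \alpha}{2} .
\]
```

Because the augmented likelihood is quadratic in \(\psi_i\), a Gaussian prior on the latent field leads to Gaussian conditional updates.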

The separability constraint \(\rho_w = -\rho_d \rho_o\) enables a Kronecker-structured sampler whose matrix-vector products cost \(O(n^3)\) rather than \(O(n^6)\), making the model practical for \(n \geq 50\).
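The cost reduction can be illustrated with a small NumPy sketch. It assumes the filter has the form \(A = I_N - \rho_d (I_n \otimes W) - \rho_o (W \otimes I_n) - \rho_w (W \otimes W)\) and that flows are vectorised row-major; the function name, the Kronecker ordering, and the dense reference are illustrative assumptions, not the library's internals.

```python
import numpy as np

def separable_filter_matvec(Y, W, rho_d, rho_o):
    """Apply A = kron(A_o, A_d) to vec(Y) using only n x n products (hypothetical helper)."""
    n = W.shape[0]
    I = np.eye(n)
    A_d = I - rho_d * W
    A_o = I - rho_o * W
    # Row-major identity: kron(A_o, A_d) @ Y.ravel() == (A_o @ Y @ A_d.T).ravel()
    return (A_o @ Y @ A_d.T).ravel()

rng = np.random.default_rng(0)
n = 6
W = rng.random((n, n))
np.fill_diagonal(W, 0.0)
W /= W.sum(axis=1, keepdims=True)  # row-standardise
Y = rng.random((n, n))
rho_d, rho_o = 0.4, 0.3

fast = separable_filter_matvec(Y, W, rho_d, rho_o)

# Dense N x N reference, built with the separability constraint rho_w = -rho_d * rho_o
I_n = np.eye(n)
rho_w = -rho_d * rho_o
A_full = (np.eye(n * n)
          - rho_d * np.kron(I_n, W)
          - rho_o * np.kron(W, I_n)
          - rho_w * np.kron(W, W))
dense = A_full @ Y.ravel()

max_abs_diff = np.max(np.abs(fast - dense))
```

The two results agree only because \(\rho_w = -\rho_d \rho_o\) lets the filter factor into a Kronecker product of two \(n \times n\) matrices; with a free \(\rho_w\) the factorisation is not available.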

Parameters:
y : array-like of int, shape (n, n) or (N,)

Observed non-negative integer flow counts.

G : libpysal.graph.Graph

Row-standardised spatial graph on n units.

X : np.ndarray or pandas.DataFrame, shape (N, p)

Full origin-destination design matrix with \(N = n^2\) rows.

col_names : list of str, optional

Column labels for X.

k : int, optional

Number of regional attribute columns.

logdet_method : {"traces", "eigenvalue", "chebyshev"}, default "chebyshev"

Method for the log-determinant. "traces" precomputes \(N \times N\) flow traces (also used by Bayesian LM diagnostics); "eigenvalue" uses the Kronecker-factored \(n \log|I_n - \rho_d W| + n \log|I_n - \rho_o W|\); "chebyshev" is a polynomial approximation for large n.

priors : dict, optional

Override default priors. Supported keys:

  • beta_mu : float, default 0.0 — Normal prior mean for beta.

  • beta_sigma : float, default 1e6 — Normal prior std for beta.

  • sigma_sigma : float, default 10.0 — HalfNormal prior std for sigma.

  • alpha_sigma : float, default 10.0 — HalfNormal prior std for alpha.

  • rho_lower : float, default -0.999 — Lower bound for each \(\rho\).

  • rho_upper : float, default 0.999 — Upper bound for each \(\rho\).
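The Kronecker-factored expression quoted for the "eigenvalue" option of logdet_method can be checked numerically. The sketch below is a stand-alone NumPy illustration under the assumption that the separable filter is the Kronecker product of \(I_n - \rho_o W\) and \(I_n - \rho_d W\); the helper name is hypothetical and is not part of bayespecon.

```python
import numpy as np

def kron_factored_logdet(W, rho_d, rho_o):
    """n*log|I - rho_d W| + n*log|I - rho_o W| from the eigenvalues of W (hypothetical helper)."""
    n = W.shape[0]
    eigs = np.linalg.eigvals(W)  # computed once, reusable for every (rho_d, rho_o)
    # log|I - rho W| = sum_i log(1 - rho * lambda_i); complex eigenvalues come in
    # conjugate pairs, so the imaginary parts cancel and the real part is kept.
    ld_d = np.sum(np.log(1.0 - rho_d * eigs)).real
    ld_o = np.sum(np.log(1.0 - rho_o * eigs)).real
    return n * ld_d + n * ld_o

rng = np.random.default_rng(1)
n = 5
W = rng.random((n, n))
np.fill_diagonal(W, 0.0)
W /= W.sum(axis=1, keepdims=True)  # row-standardise
rho_d, rho_o = 0.5, -0.2

fast = kron_factored_logdet(W, rho_d, rho_o)

# Dense check on the full N x N filter (feasible only for small n)
A_full = np.kron(np.eye(n) - rho_o * W, np.eye(n) - rho_d * W)
sign, dense = np.linalg.slogdet(A_full)
```

The eigenvalues of W are needed only once, so each subsequent evaluation costs \(O(n)\) per pair of \(\rho\) values.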

Notes

The sampler bypasses PyMC’s NUTS entirely. It produces an arviz.InferenceData object compatible with all downstream diagnostics.

\(\alpha\) (NB dispersion) mixing can be slower than \(\rho\) or \(\beta\). Monitor ESS for \(\alpha\) specifically and use longer runs if needed.

__init__(y, G, X, **kwargs)[source]

Methods

__init__(y, G, X, **kwargs)

fit([draws, tune, chains, random_seed, ...])

Sample posterior via Pólya–Gamma block Gibbs (separable).

fit_approx([draws, n, method, random_seed, ...])

Fit a variational approximation and return posterior draws.

posterior_predictive([n_draws, random_seed])

Draw posterior-predictive samples y_rep.

spatial_diagnostics()

Run Bayesian LM specification tests for flow models.

spatial_diagnostics_decision([alpha, format])

Return a model-selection decision from Bayesian LM test results.

spatial_effects([draws, ...])

Summarise posterior origin/destination/intra/network/total effects.

summary([var_names])

Return posterior summary table via ArviZ.

Attributes

approximation

Return the most recent PyMC variational approximation, if any.

inference_data

Return ArviZ InferenceData from the most recent fit, or None.

pymc_model

Return the PyMC model used for the most recent fit, or None.

property approximation[source]

Return the most recent PyMC variational approximation, if any.

fit(draws=2000, tune=1000, chains=4, random_seed=None, thin=1, return_eta=False, n_jobs=-1, progressbar=True, chebyshev_degree=30, **kwargs)[source]

Sample posterior via Pólya–Gamma block Gibbs (separable).

Parameters:
draws : int

Number of post-warmup draws per chain.

tune : int

Number of warmup draws per chain.

chains : int

Number of independent chains.

random_seed : int or None

Seed for reproducibility.

thin : int

Keep every thin-th draw. Default 1.

return_eta : bool

If True, store the full latent field η. Default False.

n_jobs : int

Number of chains to run in parallel. -1 uses all available CPUs.

progressbar : bool

Show per-chain progress bars.

chebyshev_degree : int, default 30

Degree of the Chebyshev polynomial used for the η draw.

Return type:

arviz.InferenceData

fit_approx(draws=2000, n=10000, method='advi', random_seed=None, store_lambda=False, compute_log_likelihood=True, **fit_kwargs)[source]

Fit a variational approximation and return posterior draws.

Parameters:
draws : int, default 2000

Number of samples to draw from the fitted approximation.

n : int, default 10000

Number of optimisation iterations for pm.fit.

method : {"advi", "fullrank_advi"}, default "advi"

Variational inference family to fit.

random_seed : int, optional

Seed for optimisation and posterior sampling.

store_lambda : bool, default False

If True, keep the high-dimensional fitted mean lambda in the posterior draws.

compute_log_likelihood : bool, default True

If True, compute pointwise log-likelihood after sampling and attach to the InferenceData (with Jacobian correction for SAR flow variants), enabling az.loo / az.waic.

**fit_kwargs

Additional keyword arguments forwarded to pm.fit.

property inference_data : arviz.data.inference_data.InferenceData | None[source]

Return ArviZ InferenceData from the most recent fit, or None.

posterior_predictive(n_draws=None, random_seed=None)[source]

Draw posterior-predictive samples y_rep.

For each (subsampled) posterior draw, simulates a new flow vector y_rep from the implied data-generating process, either by solving the sparse system \(A(\rho)\, y_{rep} = X\beta + \varepsilon\) (Gaussian variants) or by drawing \(y_{rep} \sim \mathrm{Poisson}(\exp(A(\rho)^{-1} X\beta))\) (Poisson variants).

Parameters:
n_draws : int, optional

Number of posterior draws to use. Defaults to all available.

random_seed : int, optional

Seed for the noise/Poisson sampler.

Returns:

Array of shape (n_draws, N) with posterior-predictive flows.

Return type:

np.ndarray
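As a rough illustration of the Poisson branch described above, the following NumPy sketch simulates y_rep for a handful of posterior draws. It assumes a separable filter \(A = (I_n - \rho_o W) \otimes (I_n - \rho_d W)\) with row-major vectorisation, and all function names and inputs are hypothetical stand-ins rather than the method's actual implementation.

```python
import numpy as np

def separable_solve(M, W, rho_d, rho_o):
    """Solve kron(A_o, A_d) vec(E) = vec(M) with two n x n solves (hypothetical helper)."""
    n = W.shape[0]
    I = np.eye(n)
    F = np.linalg.solve(I - rho_o * W, M)        # A_o^{-1} M
    return np.linalg.solve(I - rho_d * W, F.T).T  # (A_o^{-1} M) A_d^{-T}

def simulate_poisson_flows(W, X, beta_draws, rho_d_draws, rho_o_draws, seed=None):
    """Return an (n_draws, N) integer array of simulated flows, one row per posterior draw."""
    rng = np.random.default_rng(seed)
    n = W.shape[0]
    out = np.empty((len(beta_draws), n * n), dtype=np.int64)
    for s, (beta, rho_d, rho_o) in enumerate(zip(beta_draws, rho_d_draws, rho_o_draws)):
        M = (X @ beta).reshape(n, n)              # X beta arranged as an n x n matrix
        eta = separable_solve(M, W, rho_d, rho_o)  # A^{-1} X beta
        out[s] = rng.poisson(np.exp(eta.ravel()))
    return out

rng = np.random.default_rng(2)
n, p, n_draws = 4, 2, 3
W = rng.random((n, n))
np.fill_diagonal(W, 0.0)
W /= W.sum(axis=1, keepdims=True)  # row-standardise
X = rng.normal(size=(n * n, p))
beta_draws = 0.1 * rng.normal(size=(n_draws, p))  # synthetic stand-ins for posterior draws
rho_d_draws = np.array([0.30, 0.35, 0.40])
rho_o_draws = np.array([0.20, 0.25, 0.30])

y_rep = simulate_poisson_flows(W, X, beta_draws, rho_d_draws, rho_o_draws, seed=0)
```

The result has one row per posterior draw and \(N = n^2\) columns, matching the documented return shape.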

property pymc_model : pymc.model.core.Model | None[source]

Return the PyMC model used for the most recent fit, or None.

spatial_diagnostics()[source]

Run Bayesian LM specification tests for flow models.

Looks up the diagnostic suite registered for this model class and returns a tidy DataFrame with one row per test. See bayespecon.models.base.SpatialModel.spatial_diagnostics() for the column schema.

Raises:

RuntimeError – If the model has not been fit yet.

spatial_diagnostics_decision(alpha=0.05, format='graphviz')[source]

Return a model-selection decision from Bayesian LM test results.

Walks the flow decision tree using Bayesian p-values from spatial_diagnostics() and recommends either OLSFlow (no spatial dependence detected) or SARFlow (at least one direction is significant).

Parameters:
alpha : float, default 0.05

Significance level for the Bayesian p-values.

format : {"graphviz", "ascii", "model"}, default "graphviz"

Output format. "model" returns the recommended model name string. "ascii" returns an indented box-drawing tree. "graphviz" returns a graphviz.Digraph (with ASCII fallback if graphviz is not installed).

Return type:

str or graphviz.Digraph

spatial_effects(draws=None, return_posterior_samples=False, ci=0.95, mode='auto')[source]

Summarise posterior origin/destination/intra/network/total effects.

Wraps _compute_spatial_effects_posterior() to produce a tidy DataFrame indexed by predictor with posterior means, credible-interval bounds, and Bayesian p-values for each effect type (origin, destination, intra, network, total). Following Thomas-Agnan & LeSage (2014, §83.5.2), when destination and origin design blocks differ the decomposition is reported separately for shocks applied to each side.

Parameters:
draws : int, optional

Maximum number of posterior draws to use. Defaults to all.

return_posterior_samples : bool, default False

If True, also return the underlying posterior-draw arrays.

ci : float, default 0.95

Credible-interval coverage.

mode : {"auto", "combined", "separate"}, default "auto"

Controls whether destination- and origin-side effects are summed or reported separately. "auto" collapses to combined when the destination and origin design blocks are identical (self._symmetric_xo_xd) and reports both sides otherwise. "combined" always sums; "separate" always reports both.

Returns:

Long-format summary indexed by (predictor, side, effect) where side is one of "combined", "dest", "orig".

Return type:

pandas.DataFrame, or (DataFrame, dict)
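The mode rules for spatial_effects can be summarised in a few lines of plain Python. This is a sketch of the documented behaviour only; both helper names are hypothetical, and the boolean argument stands in for the self._symmetric_xo_xd flag mentioned above.

```python
def resolve_effects_mode(mode, symmetric_xo_xd):
    """Map the requested mode to 'combined' or 'separate' (hypothetical helper)."""
    if mode not in {"auto", "combined", "separate"}:
        raise ValueError(f"unknown mode: {mode!r}")
    if mode == "auto":
        # Collapse to a single summed effect only when both design blocks are identical
        return "combined" if symmetric_xo_xd else "separate"
    return mode

def sides_for(mode, symmetric_xo_xd):
    """Values of the 'side' index level implied by the resolved mode."""
    resolved = resolve_effects_mode(mode, symmetric_xo_xd)
    return ["combined"] if resolved == "combined" else ["dest", "orig"]

auto_symmetric = sides_for("auto", True)
auto_asymmetric = sides_for("auto", False)
forced_separate = sides_for("separate", True)
```

Under these rules an explicit "combined" or "separate" always overrides the symmetry check; only "auto" consults it.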

summary(var_names=None, **kwargs)[source]

Return posterior summary table via ArviZ.

Parameters:
var_names : list, optional

Variable names to include. Defaults to all parameters.

**kwargs

Additional keyword arguments forwarded to az.summary.

Return type:

pandas.DataFrame