bayespecon.models.flow.SARNegBinFlowSeparableLatent

class bayespecon.models.flow.SARNegBinFlowSeparableLatent(y, G, X, **kwargs)

Bayesian separable SAR-NB flow model with Pólya–Gamma Gibbs sampler.

Counterpart of SARFlowSeparable (Gaussian) with a Negative Binomial observation model. Pólya–Gamma data augmentation yields fully conjugate Gibbs updates for \(\eta\), \(\beta\), and \(\sigma^2\).
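For reference, the conjugacy rests on the Pólya–Gamma integral identity of Polson, Scott and Windle (2013). The display below states the identity and applies it to a Negative Binomial kernel. The symbol \(\psi_i\) for the log-odds of observation \(i\), and the way it maps onto \(\eta\) and \(\alpha\) in this class, are assumptions made for illustration and are not taken from the implementation.

\[
\frac{(e^{\psi})^{a}}{(1 + e^{\psi})^{b}}
= 2^{-b}\, e^{\kappa \psi} \int_0^{\infty} e^{-\omega \psi^{2}/2}\, p(\omega)\, d\omega,
\qquad \kappa = a - \tfrac{b}{2}, \quad \omega \sim \mathrm{PG}(b, 0).
\]

\[
p(y_i \mid \psi_i, \alpha) \propto \frac{(e^{\psi_i})^{y_i}}{(1 + e^{\psi_i})^{y_i + \alpha}}
\;\Longrightarrow\;
a = y_i, \quad b = y_i + \alpha, \quad \kappa_i = \frac{y_i - \alpha}{2}.
\]

\[
\omega_i \mid \psi_i \sim \mathrm{PG}(y_i + \alpha,\ \psi_i),
\qquad
p(y_i \mid \psi_i, \omega_i) \propto \exp\!\left(\kappa_i \psi_i - \tfrac{1}{2}\,\omega_i \psi_i^{2}\right).
\]

Conditional on \(\omega_i\), the kernel is Gaussian in \(\psi_i\), with pseudo-observation \(\kappa_i / \omega_i\) and precision \(\omega_i\). This is what makes Gaussian-prior blocks conjugate.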

The separability constraint \(\rho_w = -\rho_d \rho_o\) enables the Kronecker-structured sampler with \(O(n^3)\) matvec instead of \(O(n^6)\), making it practical for \(n \geq 50\).
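The factorisation behind the separability constraint can be checked numerically. The sketch below assumes the common convention \(W_d = I_n \otimes W\), \(W_o = W \otimes I_n\), \(W_w = W \otimes W\) and a row-major flattening of the \(n \times n\) flow matrix. The helper name `kron_matvec` and the random weights matrix are hypothetical, and the library's internal layout may differ.

```python
import numpy as np


def kron_matvec(A, B, y):
    """Apply (A kron B) to y without forming the N x N matrix.

    Uses the row-major identity (A kron B) vec(Y) = vec(A @ Y @ B.T).
    """
    n = A.shape[0]
    Y = y.reshape(n, n)
    return (A @ Y @ B.T).ravel()


rng = np.random.default_rng(0)
n = 6

# Hypothetical row-standardised weights matrix with a zero diagonal.
W = rng.random((n, n))
np.fill_diagonal(W, 0.0)
W /= W.sum(axis=1, keepdims=True)

rho_d, rho_o = 0.4, 0.3
rho_w = -rho_d * rho_o  # separability constraint

I_n = np.eye(n)
A_o = I_n - rho_o * W
A_d = I_n - rho_d * W

# Dense N x N filter, built only to verify the factorisation.
A_full = (
    np.eye(n * n)
    - rho_d * np.kron(I_n, W)
    - rho_o * np.kron(W, I_n)
    - rho_w * np.kron(W, W)
)

y = rng.random(n * n)
dense = A_full @ y
fast = kron_matvec(A_o, A_d, y)
```

Here `dense` and `fast` agree to floating-point precision. The factored product needs only two \(n \times n\) matrix multiplications, so it costs \(O(n^3)\) and never materialises an \(N \times N\) array.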

Parameters:

y : array-like of int, shape (n, n) or (N,)

Observed non-negative integer flow counts.

G : libpysal.graph.Graph

Row-standardised spatial graph on n units.

X : np.ndarray or pandas.DataFrame, shape (N, p)

Full origin-destination design matrix with \(N = n^2\) rows.

col_names : list of str, optional

Column labels for X.

k : int, optional

Number of regional attribute columns.

logdet_method : {"traces", "eigenvalue", "chebyshev"}, default "chebyshev"

Method for the log-determinant. "traces" precomputes \(N \times N\) flow traces (also used by Bayesian LM diagnostics); "eigenvalue" uses the Kronecker-factored \(n \log|I_n - \rho_d W| + n \log|I_n - \rho_o W|\); "chebyshev" is a polynomial approximation for large n.

priors : dict, optional

Override default priors. Supported keys:

beta_mu : float, default 0.0 — Normal prior mean for beta.
beta_sigma : float, default 1e6 — Normal prior std for beta.
sigma_sigma : float, default 10.0 — HalfNormal prior std for sigma.
alpha_sigma : float, default 10.0 — HalfNormal prior std for alpha.
rho_lower : float, default -0.999 — Lower bound for each \(\rho\).
rho_upper : float, default 0.999 — Upper bound for each \(\rho\).
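The Kronecker-factored log-determinant quoted for the "eigenvalue" option of logdet_method can be illustrated as follows. This is a minimal sketch rather than the library's code: the weights matrix is random, the helper `logdet_factor` is hypothetical, and the dense determinant is computed only as a cross-check.

```python
import numpy as np

rng = np.random.default_rng(1)
n = 5

# Hypothetical row-standardised weights matrix with a zero diagonal.
W = rng.random((n, n))
np.fill_diagonal(W, 0.0)
W /= W.sum(axis=1, keepdims=True)

# Eigenvalues of W are computed once and reused for every rho.
lam = np.linalg.eigvals(W)


def logdet_factor(rho):
    """log|I_n - rho W| from the precomputed eigenvalues of W."""
    return float(np.sum(np.log(np.abs(1.0 - rho * lam))))


rho_d, rho_o = 0.5, -0.2
logdet_factored = n * logdet_factor(rho_d) + n * logdet_factor(rho_o)

# Cross-check against the dense N x N filter (feasible only for small n).
I_n = np.eye(n)
A_full = np.kron(I_n - rho_o * W, I_n - rho_d * W)
sign, logdet_dense = np.linalg.slogdet(A_full)
```

The two values agree because \(\det(A \otimes B) = \det(A)^n \det(B)^n\) for \(n \times n\) factors. After the one-off eigendecomposition, each evaluation costs \(O(n)\).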

Notes

The sampler bypasses PyMC’s NUTS entirely. It produces an arviz.InferenceData object compatible with all downstream diagnostics.

\(\alpha\) (NB dispersion) mixing can be slower than \(\rho\) or \(\beta\). Monitor ESS for \(\alpha\) specifically and use longer runs if needed.
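To make the ESS advice concrete, the sketch below shows how autocorrelation reduces the effective sample size of a chain. It uses a simple truncated-autocorrelation estimator on synthetic single chains, which is not the rank-normalised estimator ArviZ applies. On a fitted model one would more likely call `az.ess` on the returned InferenceData; the posterior variable name to pass there (for example "alpha") is an assumption.

```python
import numpy as np


def ess_single_chain(x):
    """Crude ESS: n / (1 + 2 * sum of leading positive autocorrelations)."""
    x = np.asarray(x, dtype=float)
    n = x.size
    x = x - x.mean()
    acov = np.correlate(x, x, mode="full")[n - 1:] / n
    rho = acov / acov[0]
    tau = 1.0
    for k in range(1, n):
        if rho[k] <= 0.0:  # stop at the first non-positive autocorrelation
            break
        tau += 2.0 * rho[k]
    return n / tau


rng = np.random.default_rng(42)
n_draws = 2000

# Well-mixing stand-in chain: independent draws.
white = rng.normal(size=n_draws)

# Slowly mixing stand-in chain: AR(1) with strong persistence.
sticky = np.empty(n_draws)
sticky[0] = 0.0
for t in range(1, n_draws):
    sticky[t] = 0.95 * sticky[t - 1] + rng.normal()

ess_white = ess_single_chain(white)
ess_sticky = ess_single_chain(sticky)
```

The persistent chain yields a far smaller ESS than the independent one for the same number of draws. That gap is the symptom to look for in \(\alpha\).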

__init__(y, G, X, **kwargs)

Methods

`__init__`(y, G, X, **kwargs)
`fit`([draws, tune, chains, random_seed, ...])	Sample posterior via Pólya–Gamma block Gibbs (separable).
`fit_approx`([draws, n, method, random_seed, ...])	Fit a variational approximation and return posterior draws.
`posterior_predictive`([n_draws, random_seed])	Draw posterior-predictive samples `y_rep`.
`spatial_diagnostics`()	Run Bayesian LM specification tests for flow models.
`spatial_diagnostics_decision`([alpha, format])	Return a model-selection decision from Bayesian LM test results.
`spatial_effects`([draws, ...])	Summarise posterior origin/destination/intra/network/total effects.
`summary`([var_names])	Return posterior summary table via ArviZ.

Attributes

`approximation`	Return the most recent PyMC variational approximation, if any.
`inference_data`	Return ArviZ InferenceData from the most recent fit, or None.
`pymc_model`	Return the PyMC model used for the most recent fit, or None.

property approximation: Return the most recent PyMC variational approximation, if any.

fit(draws=2000, tune=1000, chains=4, random_seed=None, thin=1, return_eta=False, n_jobs=-1, progressbar=True, chebyshev_degree=30, **kwargs)

Sample posterior via Pólya–Gamma block Gibbs (separable).

Parameters:

draws : int: Number of post-warmup draws per chain.
tune : int: Number of warmup draws per chain.
chains : int: Number of independent chains.
random_seed : int or None: Seed for reproducibility.
thin : int: Keep every thin-th draw. Default 1.
return_eta : bool: If True, store the full latent field η. Default False.
n_jobs : int: Number of chains to run in parallel; -1 uses all available CPUs.
progressbar : bool: Show per-chain progress bars.
chebyshev_degree : int, default 30: Degree of the Chebyshev polynomial used in the η draw.

Return type:

arviz.InferenceData

fit_approx(draws=2000, n=10000, method='advi', random_seed=None, store_lambda=False, compute_log_likelihood=True, **fit_kwargs)

Fit a variational approximation and return posterior draws.

Parameters:

draws : int, default 2000: Number of samples to draw from the fitted approximation.
n : int, default 10000: Number of optimisation iterations for pm.fit.
method : {"advi", "fullrank_advi"}, default "advi": Variational inference family to fit.
random_seed : int, optional: Seed for optimisation and posterior sampling.
store_lambda : bool, default False: If True, keep the high-dimensional fitted mean lambda in the posterior draws.
compute_log_likelihood : bool, default True: If True, compute the pointwise log-likelihood after sampling and attach it to the InferenceData (with Jacobian correction for SAR flow variants), enabling az.loo / az.waic.
**fit_kwargs: Additional keyword arguments forwarded to pm.fit.

property inference_data : arviz.data.inference_data.InferenceData | None: Return ArviZ InferenceData from the most recent fit, or None.

posterior_predictive(n_draws=None, random_seed=None)

Draw posterior-predictive samples y_rep.

For each (subsampled) posterior draw, simulates a new flow vector y_rep from the implied data-generating process, either by solving the sparse system \(A(\rho)\, y_{\mathrm{rep}} = X\beta + \varepsilon\) (Gaussian variants) or by drawing \(y_{\mathrm{rep}} \sim \mathrm{Poisson}(\exp(A(\rho)^{-1} X\beta))\) (Poisson variants).

Parameters:

n_draws : int, optional: Number of posterior draws to use. Defaults to all available.
random_seed : int, optional: Seed for the noise/Poisson sampler.

Returns:

Array of shape (n_draws, N) with posterior-predictive flows.

Return type:

np.ndarray
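The Gaussian-variant simulation step described for posterior_predictive can be sketched in plain NumPy. The example is illustrative only. The "posterior draws" are synthetic stand-ins, the filter is assumed to take the separable form \(A(\rho) = (I_n - \rho_o W) \otimes (I_n - \rho_d W)\) with row-major flattening, and dense solves replace whatever sparse solver the library uses.

```python
import numpy as np

rng = np.random.default_rng(7)
n, p, n_draws = 5, 2, 3
N = n * n

# Hypothetical row-standardised weights matrix and design matrix.
W = rng.random((n, n))
np.fill_diagonal(W, 0.0)
W /= W.sum(axis=1, keepdims=True)
X = rng.normal(size=(N, p))

# Synthetic stand-ins for posterior draws (not output of a real fit).
rho_d = rng.uniform(0.1, 0.5, size=n_draws)
rho_o = rng.uniform(0.1, 0.5, size=n_draws)
beta = rng.normal(size=(n_draws, p))
sigma = np.full(n_draws, 0.5)

I_n = np.eye(n)
y_rep = np.empty((n_draws, N))
for s in range(n_draws):
    A_o = I_n - rho_o[s] * W
    A_d = I_n - rho_d[s] * W
    rhs = X @ beta[s] + sigma[s] * rng.normal(size=N)
    R = rhs.reshape(n, n)
    # (A_o kron A_d) vec(Y) = vec(R)  <=>  A_o @ Y @ A_d.T = R
    Z = np.linalg.solve(A_o, R)        # Z = A_o^{-1} R
    Y = np.linalg.solve(A_d, Z.T).T    # Y = Z A_d^{-T}
    y_rep[s] = Y.ravel()
```

The result has shape (n_draws, N), matching the documented return value. Each draw costs two \(n \times n\) solves and never forms the \(N \times N\) system.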

property pymc_model : pymc.model.core.Model | None: Return the PyMC model used for the most recent fit, or None.

spatial_diagnostics()

Run Bayesian LM specification tests for flow models.

Looks up the diagnostic suite registered for this model class and returns a tidy DataFrame with one row per test. See bayespecon.models.base.SpatialModel.spatial_diagnostics() for the column schema.

Raises: RuntimeError – If the model has not been fit yet.

spatial_diagnostics_decision(alpha=0.05, format='graphviz')

Return a model-selection decision from Bayesian LM test results.

Walks the flow decision tree using Bayesian p-values from spatial_diagnostics() and recommends either OLSFlow (no spatial dependence detected) or SARFlow (at least one direction is significant).

Parameters:

alpha : float, default 0.05: Significance level for the Bayesian p-values.
format : {"graphviz", "ascii", "model"}, default "graphviz": Output format. "model" returns the recommended model name string. "ascii" returns an indented box-drawing tree. "graphviz" returns a graphviz.Digraph (with ASCII fallback if graphviz is not installed).

Return type:

str or graphviz.Digraph
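The rule stated for spatial_diagnostics_decision (OLSFlow when nothing is significant, SARFlow when at least one direction is) can be written as a small function. This is a simplified sketch that collapses the decision tree into a single check. The function name and the test labels "lm_dest" and "lm_orig" are hypothetical, not the names the library reports.

```python
def recommend_flow_model(p_values, alpha=0.05):
    """Return (model_name, significant_tests) from a dict of Bayesian p-values."""
    significant = [name for name, p in p_values.items() if p < alpha]
    model = "SARFlow" if significant else "OLSFlow"
    return model, significant


# Hypothetical p-values: only the destination-side test is significant.
model, hits = recommend_flow_model({"lm_dest": 0.01, "lm_orig": 0.40}, alpha=0.05)

# Hypothetical p-values: no test is significant.
model_null, hits_null = recommend_flow_model({"lm_dest": 0.30, "lm_orig": 0.40})
```

With format="model", the method returns only the recommended model name as a string, which corresponds to the first element of the tuple here.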

spatial_effects(draws=None, return_posterior_samples=False, ci=0.95, mode='auto')

Summarise posterior origin/destination/intra/network/total effects.

Wraps _compute_spatial_effects_posterior() to produce a tidy DataFrame indexed by predictor with posterior means, credible-interval bounds, and Bayesian p-values for each effect type (origin, destination, intra, network, total). Following Thomas-Agnan & LeSage (2014, §83.5.2), when destination and origin design blocks differ the decomposition is reported separately for shocks applied to each side.

Parameters:

draws : int, optional: Maximum number of posterior draws to use. Defaults to all.
return_posterior_samples : bool, default False: If True, also return the underlying posterior-draw arrays.
ci : float, default 0.95: Credible-interval coverage.
mode : {"auto", "combined", "separate"}, default "auto": Controls whether destination- and origin-side effects are summed or reported separately. "auto" collapses to combined when the destination and origin design blocks are identical (self._symmetric_xo_xd) and reports both sides otherwise. "combined" always sums; "separate" always reports both.

Returns:

Long-format summary indexed by (predictor, side, effect) where side is one of "combined", "dest", "orig".

Return type:

pandas.DataFrame, or (DataFrame, dict)
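The per-effect statistics reported by spatial_effects (posterior mean, credible-interval bounds, Bayesian p-value) can be illustrated on a vector of draws. The sketch uses equal-tailed quantile intervals and a two-sided tail-probability p-value. Both are common conventions assumed here, and the library's exact definitions may differ. The function name and the synthetic draws are hypothetical.

```python
import numpy as np


def summarise_effect(draws, ci=0.95):
    """Summarise posterior draws of one scalar effect."""
    draws = np.asarray(draws, dtype=float)
    tail = (1.0 - ci) / 2.0
    lower, upper = np.quantile(draws, [tail, 1.0 - tail])
    p_pos = float(np.mean(draws > 0.0))
    # Two-sided tail probability of the effect having the opposite sign.
    bayes_p = 2.0 * min(p_pos, 1.0 - p_pos)
    return {
        "mean": float(draws.mean()),
        "ci_lower": float(lower),
        "ci_upper": float(upper),
        "bayes_p": bayes_p,
    }


rng = np.random.default_rng(3)

# Synthetic stand-in for posterior draws of a total effect.
total_effect = rng.normal(loc=0.8, scale=0.2, size=4000)
row = summarise_effect(total_effect, ci=0.95)
```

One such row per (predictor, side, effect) combination gives the long-format layout described under Returns.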

summary(var_names=None, **kwargs)

Return posterior summary table via ArviZ.

Parameters:

var_names : list, optional: Variable names to include. Defaults to all parameters.
**kwargs: Additional keyword arguments forwarded to az.summary.

Return type:

pandas.DataFrame