bayespecon.models.flow.SARNegBinFlowLatent¶
- class bayespecon.models.flow.SARNegBinFlowLatent(y, G, X, **kwargs)[source]¶
Bayesian structural-form SAR-NB flow model with Pólya–Gamma Gibbs sampler.
The structural form parameterises the latent log-mean as
\[\eta = \rho_d W_d \eta + \rho_o W_o \eta + \rho_w W_w \eta + X\beta + \nu, \quad \nu \sim N(0, \sigma^2 I_N)\]
where \(N = n^2\) and \(W_d = I_n \otimes W\), \(W_o = W \otimes I_n\), \(W_w = W \otimes W\).
Three free \(\rho\) parameters are estimated via collapsed 1-D slice samplers (one per \(\rho\), cycling with the others fixed). The \(\eta\) draw uses the general sparse \(N \times N\) precision matrix with Chebyshev polynomial approximation.
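The Kronecker definitions above can be sketched with NumPy. This is a minimal illustration under stated assumptions, not the library's internal code: the 3-region matrix `W` and the \(\rho\) values are made up, the destination/origin/network labels in the comments follow the usual reading of the subscripts, and dense arrays are used for readability where the class itself works with sparse matrices.

```python
import numpy as np

# Hypothetical row-standardised weights for n = 3 regions (illustration only).
W = np.array([
    [0.0, 0.5, 0.5],
    [0.5, 0.0, 0.5],
    [0.5, 0.5, 0.0],
])
n = W.shape[0]
I_n = np.eye(n)

# Flow weight matrices on the N = n^2 origin-destination pairs.
W_d = np.kron(I_n, W)  # W_d = I_n (x) W, destination-side dependence
W_o = np.kron(W, I_n)  # W_o = W (x) I_n, origin-side dependence
W_w = np.kron(W, W)    # W_w = W (x) W, network dependence

# Spatial filter A(rho) = I_N - rho_d W_d - rho_o W_o - rho_w W_w
# with made-up rho values.
rho_d, rho_o, rho_w = 0.3, 0.2, -0.1
A = np.eye(n * n) - rho_d * W_d - rho_o * W_o - rho_w * W_w
```

Because `W` is row-standardised, each of the three Kronecker products is row-standardised as well, so every row of `A` sums to `1 - rho_d - rho_o - rho_w`.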
Use this model when:
- The separability constraint \(\rho_w = -\rho_d \rho_o\) is too restrictive for the data.
- You need to test whether \(\rho_w\) is significantly different from \(-\rho_d \rho_o\).
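One informal way to examine the separability constraint is to look at the posterior of the gap \(\rho_w + \rho_d \rho_o\), which is zero under separability. The sketch below uses synthetic arrays as stand-ins for posterior draws. How the draws are stored in the fitted object is not shown here, and the equal-tailed 95% interval is an illustrative choice rather than a test provided by the library.

```python
import numpy as np

rng = np.random.default_rng(0)

# Synthetic stand-ins for posterior draws (assumption: real draws would be
# extracted from the InferenceData returned by fit()).
rho_d = rng.normal(0.30, 0.02, size=4000)
rho_o = rng.normal(0.20, 0.02, size=4000)
rho_w = rng.normal(-0.30, 0.02, size=4000)

# Under separability, rho_w = -rho_d * rho_o, so this gap should be near zero.
gap = rho_w + rho_d * rho_o

# Equal-tailed 95% credible interval for the gap.
lo, hi = np.quantile(gap, [0.025, 0.975])
separability_plausible = bool(lo <= 0.0 <= hi)
```

With these synthetic draws the interval lies entirely below zero, so the constraint would look too restrictive and the unconstrained three-\(\rho\) model would be the natural choice.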
Use SARNegBinFlowSeparableLatent when:
- The separability constraint is plausible (most flow applications).
- You want the faster \(O(n^3)\) Kronecker-structured sampler.
- Parameters:¶
- y : array-like of int, shape (n, n) or (N,)¶
Observed non-negative integer flow counts.
- G : libpysal.graph.Graph¶
Row-standardised spatial graph on n units.
- X : np.ndarray or pandas.DataFrame, shape (N, p)¶
Full origin-destination design matrix with \(N = n^2\) rows.
- col_names : list of str, optional
Column labels for X.
- k : int, optional
Number of regional attribute columns.
- logdet_method : str, default "traces"
Method for the \(N \times N\) flow log-determinant. Only "traces" is supported because the three-\(\rho\) log-determinant \(\log|I_N - \rho_d W_d - \rho_o W_o - \rho_w W_w|\) cannot be decomposed into \(n \times n\) eigenvalues.
- priors : dict, optional
Override default priors. Supported keys:
  - beta_mu : float, default 0.0. Normal prior mean for beta.
  - beta_sigma : float, default 1e6. Normal prior std for beta.
  - sigma_sigma : float, default 10.0. HalfNormal prior std for sigma.
  - alpha_sigma : float, default 10.0. HalfNormal prior std for alpha.
  - rho_lower : float, default -0.999. Lower bound for each \(\rho\).
  - rho_upper : float, default 0.999. Upper bound for each \(\rho\).
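To give a feel for what a trace-based log-determinant involves, the sketch below evaluates \(\log|I_N - B|\) with \(B = \rho_d W_d + \rho_o W_o + \rho_w W_w\) through the power series \(-\sum_{k \ge 1} \operatorname{tr}(B^k)/k\) and compares it with a direct dense computation. This is an assumption-laden illustration: the source does not describe how the "traces" method is implemented, and the helper `logdet_trace_series`, the toy `W`, and the \(\rho\) values are all hypothetical.

```python
import numpy as np

# Hypothetical row-standardised weights for n = 3 regions.
W = np.array([
    [0.0, 0.5, 0.5],
    [0.5, 0.0, 0.5],
    [0.5, 0.5, 0.0],
])
n = W.shape[0]
I_n = np.eye(n)
W_d, W_o, W_w = np.kron(I_n, W), np.kron(W, I_n), np.kron(W, W)

# Made-up rho values, small enough that the series converges.
rho_d, rho_o, rho_w = 0.3, 0.2, -0.06
B = rho_d * W_d + rho_o * W_o + rho_w * W_w


def logdet_trace_series(B, order=50):
    """Approximate log|I - B| by -sum_{k=1}^{order} tr(B^k) / k."""
    total = 0.0
    power = np.eye(B.shape[0])
    for k in range(1, order + 1):
        power = power @ B          # power now holds B^k
        total -= np.trace(power) / k
    return total


approx = logdet_trace_series(B)
sign, exact = np.linalg.slogdet(np.eye(n * n) - B)
```

The series only converges when the spectral radius of `B` is below one, which is one reason the \(\rho\) parameters are bounded away from \(\pm 1\).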
Notes
The sampler bypasses PyMC’s NUTS entirely. It produces an arviz.InferenceData object compatible with all downstream diagnostics (spatial_diagnostics(), spatial_effects(), summary()).
The fit() method does not accept nuts_sampler or target_accept kwargs. These are NUTS-specific and will raise TypeError if passed.
\(\alpha\) (NB dispersion) mixing can be slower than \(\rho\) or \(\beta\). Monitor ESS for \(\alpha\) specifically and use longer runs if needed.
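When sampling settings are shared between NUTS-based models and this Gibbs-based one, the NUTS-only keys have to be removed before calling fit(). A minimal sketch, assuming a plain dictionary of settings; `gibbs_fit_kwargs` is a hypothetical helper and not part of the library.

```python
# Keyword arguments the Notes above describe as NUTS-specific.
NUTS_ONLY_KWARGS = {"nuts_sampler", "target_accept"}


def gibbs_fit_kwargs(kwargs):
    """Return a copy of kwargs without the NUTS-specific entries."""
    return {k: v for k, v in kwargs.items() if k not in NUTS_ONLY_KWARGS}


# Settings that might be reused across several model classes.
shared = {"draws": 2000, "tune": 1000, "chains": 4, "target_accept": 0.9}
cleaned = gibbs_fit_kwargs(shared)
```

The cleaned dictionary could then be unpacked into the call, for example `model.fit(**cleaned)`, where `model` is an already constructed instance.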
Methods
__init__(y, G, X, **kwargs)
fit([draws, tune, chains, random_seed, ...])
Sample posterior via Pólya–Gamma block Gibbs.
fit_approx([draws, n, method, random_seed, ...])
Fit a variational approximation and return posterior draws.
posterior_predictive([n_draws, random_seed])
Draw posterior-predictive samples y_rep.
spatial_diagnostics()
Run Bayesian LM specification tests for flow models.
spatial_diagnostics_decision([alpha, format])
Return a model-selection decision from Bayesian LM test results.
spatial_effects([draws, ...])
Summarise posterior origin/destination/intra/network/total effects.
summary([var_names])
Return posterior summary table via ArviZ.
Attributes
Return the most recent PyMC variational approximation, if any.
inference_data
Return ArviZ InferenceData from the most recent fit, or None.
pymc_model
Return the PyMC model used for the most recent fit, or None.
-
fit(draws=2000, tune=1000, chains=4, random_seed=None, thin=1, return_eta=False, n_jobs=-1, progressbar=True, chebyshev_degree=30, **kwargs)[source]¶
Sample posterior via Pólya–Gamma block Gibbs.
- Parameters:¶
- draws : int¶
Number of post-warmup draws per chain.
- tune : int¶
Number of warmup (burn-in) draws per chain.
- chains : int¶
Number of independent chains.
- random_seed : int or None¶
Seed for reproducibility.
- thin : int¶
Keep every thin-th draw. Default 1.
- return_eta : bool¶
If True, store the full latent field η. Default False.
- n_jobs : int¶
Number of chains to run in parallel; -1 uses all available CPUs.
- progressbar : bool¶
Show per-chain progress bars.
- chebyshev_degree : int, default 30¶
Degree of the Chebyshev polynomial approximation used in the η draw.
- Return type:¶
arviz.InferenceData
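The effect of chebyshev_degree can be illustrated with a scalar example. The sketch below interpolates \(x^{-1/2}\) on a positive interval and compares a low degree with the default of 30. The choice of target function and interval is an assumption made for illustration (inverse square roots of a precision matrix appear in Gaussian sampling); the source does not state which function the sampler approximates.

```python
import numpy as np
from numpy.polynomial import Chebyshev


def inv_sqrt(x):
    """Scalar stand-in for the matrix function being approximated (assumption)."""
    return 1.0 / np.sqrt(x)


# Hypothetical interval standing in for the spectrum of the precision matrix.
domain = [0.1, 2.0]
grid = np.linspace(domain[0], domain[1], 1001)


def max_error(degree):
    """Largest absolute interpolation error on the evaluation grid."""
    approx = Chebyshev.interpolate(inv_sqrt, degree, domain=domain)
    return float(np.max(np.abs(approx(grid) - inv_sqrt(grid))))


err_low = max_error(5)       # coarse approximation
err_default = max_error(30)  # degree matching the documented default
```

The error shrinks quickly as the degree grows, at the cost of more matrix-vector products per draw, so a higher degree is mainly worth trying when the spectrum is wide.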
-
fit_approx(draws=2000, n=10000, method='advi', random_seed=None, store_lambda=False, compute_log_likelihood=True, **fit_kwargs)[source]¶
Fit a variational approximation and return posterior draws.
- Parameters:¶
- draws : int, default 2000¶
Number of samples to draw from the fitted approximation.
- n : int, default 10000¶
Number of optimisation iterations for pm.fit.
- method : {"advi", "fullrank_advi"}, default "advi"¶
Variational inference family to fit.
- random_seed : int, optional¶
Seed for optimisation and posterior sampling.
- store_lambda : bool, default False¶
If True, keep the high-dimensional fitted mean lambda in the posterior draws.
- compute_log_likelihood : bool, default True¶
If True, compute pointwise log-likelihood after sampling and attach to the InferenceData (with Jacobian correction for SAR flow variants), enabling az.loo / az.waic.
- **fit_kwargs¶
Additional keyword arguments forwarded to pm.fit.
- property inference_data : arviz.data.inference_data.InferenceData | None[source]¶
Return ArviZ InferenceData from the most recent fit, or None.
-
posterior_predictive(n_draws=None, random_seed=None)[source]¶
Draw posterior-predictive samples y_rep.
For each (subsampled) posterior draw, simulates a new flow vector y_rep from the implied data-generating process by solving the sparse system A(rho) y_rep = X β + ε (Gaussian) or y_rep ~ Poisson(exp(A^{-1} X β)) (Poisson variants).
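The Poisson formula quoted above can be sketched for a single posterior draw as follows. Everything here is illustrative: the weights, the \(\rho\) values, the design matrix and the coefficients are made up, dense algebra replaces the sparse solve, and the block reproduces only the quoted formula rather than the method's actual code path for this class.

```python
import numpy as np

rng = np.random.default_rng(42)

# Hypothetical row-standardised weights for n = 3 regions.
W = np.array([
    [0.0, 0.5, 0.5],
    [0.5, 0.0, 0.5],
    [0.5, 0.5, 0.0],
])
n = W.shape[0]
N = n * n
I_n = np.eye(n)
W_d, W_o, W_w = np.kron(I_n, W), np.kron(W, I_n), np.kron(W, W)

# One made-up posterior draw of the spatial parameters.
rho_d, rho_o, rho_w = 0.3, 0.2, -0.06
A = np.eye(N) - rho_d * W_d - rho_o * W_o - rho_w * W_w

# Hypothetical design matrix (intercept plus one covariate) and coefficients.
X = np.column_stack([np.ones(N), rng.normal(size=N)])
beta = np.array([0.5, 0.2])

# Solve A @ log_mean = X @ beta instead of forming A^{-1} explicitly.
log_mean = np.linalg.solve(A, X @ beta)

# One replicated flow vector: y_rep ~ Poisson(exp(A^{-1} X beta)).
y_rep = rng.poisson(np.exp(log_mean))
```

Repeating the last two steps over many posterior draws would give a collection of replicated flow vectors to compare against the observed counts.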
- property pymc_model : pymc.model.core.Model | None[source]¶
Return the PyMC model used for the most recent fit, or None.
- spatial_diagnostics()[source]¶
Run Bayesian LM specification tests for flow models.
Looks up the diagnostic suite registered for this model class and returns a tidy DataFrame with one row per test. See bayespecon.models.base.SpatialModel.spatial_diagnostics() for the column schema.
- Raises:¶
RuntimeError – If the model has not been fit yet.
-
spatial_diagnostics_decision(alpha=0.05, format='graphviz')[source]¶
Return a model-selection decision from Bayesian LM test results.
Walks the flow decision tree using Bayesian p-values from spatial_diagnostics() and recommends either OLSFlow (no spatial dependence detected) or SARFlow (at least one direction is significant).
- Parameters:¶
- alpha : float, default 0.05¶
Significance level for the Bayesian p-values.
- format : {"graphviz", "ascii", "model"}, default "graphviz"¶
Output format. "model" returns the recommended model name string. "ascii" returns an indented box-drawing tree. "graphviz" returns a graphviz.Digraph (with ASCII fallback if graphviz is not installed).
- Return type:¶
str or graphviz.Digraph
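The rule described above (SARFlow if at least one direction is significant, OLSFlow otherwise) can be sketched as a small function. This is a simplified stand-in, not the library's decision tree: `recommend_flow_model` and the test labels in the example dictionary are hypothetical, and the real tree may apply additional steps.

```python
def recommend_flow_model(p_values, alpha=0.05):
    """Return the recommended model name and the tests flagged as significant.

    p_values maps a (hypothetical) test label to its Bayesian p-value.
    """
    significant = [name for name, p in p_values.items() if p < alpha]
    model_name = "SARFlow" if significant else "OLSFlow"
    return model_name, significant


# Made-up p-values for three directions of spatial dependence.
model_name, flagged = recommend_flow_model(
    {"dest": 0.01, "orig": 0.40, "network": 0.22}
)
```

Here only the first test falls below `alpha`, which is already enough to recommend the spatial model.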
-
spatial_effects(draws=None, return_posterior_samples=False, ci=0.95, mode='auto')[source]¶
Summarise posterior origin/destination/intra/network/total effects.
Wraps _compute_spatial_effects_posterior() to produce a tidy DataFrame indexed by predictor with posterior means, credible-interval bounds, and Bayesian p-values for each effect type (origin, destination, intra, network, total). Following Thomas-Agnan & LeSage (2014, §83.5.2), when destination and origin design blocks differ, the decomposition is reported separately for shocks applied to each side.
- Parameters:¶
- draws : int, optional¶
Maximum number of posterior draws to use. Defaults to all.
- return_posterior_samples : bool, default False¶
If True, also return the underlying posterior-draw arrays.
- ci : float, default 0.95¶
Credible-interval coverage.
- mode : {"auto", "combined", "separate"}, default "auto"¶
Controls whether destination- and origin-side effects are summed or reported separately. "auto" collapses to combined when the destination and origin design blocks are identical (self._symmetric_xo_xd) and reports both sides otherwise. "combined" always sums; "separate" always reports both.
- Returns:¶
Long-format summary indexed by (predictor, side, effect) where side is one of "combined", "dest", "orig".
- Return type:¶
pandas.DataFrame, or (DataFrame, dict)
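The way mode interacts with the symmetry of the design blocks can be sketched as below. The two helpers are hypothetical and only mirror the behaviour described for the mode parameter; `symmetric_design` stands in for the internal `_symmetric_xo_xd` flag.

```python
def resolve_effects_mode(mode, symmetric_design):
    """Map the user-facing mode to either "combined" or "separate"."""
    if mode not in {"auto", "combined", "separate"}:
        raise ValueError(f"unknown mode: {mode!r}")
    if mode == "auto":
        # Identical destination and origin blocks collapse to one summary.
        return "combined" if symmetric_design else "separate"
    return mode


def sides_for(mode, symmetric_design):
    """Values of the `side` index level expected in the summary."""
    resolved = resolve_effects_mode(mode, symmetric_design)
    return ["combined"] if resolved == "combined" else ["dest", "orig"]


auto_symmetric = sides_for("auto", symmetric_design=True)
auto_asymmetric = sides_for("auto", symmetric_design=False)
```

Under this reading, `"auto"` yields a single `"combined"` side for symmetric designs and both `"dest"` and `"orig"` otherwise, while the two explicit modes ignore the symmetry flag.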