bayespecon.models.flow.SARNegBinFlowSeparableLatent¶
- class bayespecon.models.flow.SARNegBinFlowSeparableLatent(y, G, X, **kwargs)[source]¶
Bayesian separable SAR-NB flow model with Pólya–Gamma Gibbs sampler.
Same observable likelihood as SARFlowSeparable (Gaussian) but with a Negative Binomial observation model and Pólya–Gamma data augmentation, yielding fully conjugate Gibbs updates for \(\eta\), \(\beta\), and \(\sigma^2\).
The separability constraint \(\rho_w = -\rho_d \rho_o\) enables the Kronecker-structured sampler with \(O(n^3)\) matvec instead of \(O(n^6)\), making it practical for \(n \geq 50\).
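The effect of the separability constraint can be checked numerically. The sketch below is illustrative only and does not use bayespecon: it assumes the common flow-weight convention \(W_d = I_n \otimes W\), \(W_o = W \otimes I_n\), \(W_w = W \otimes W\) with row-major flattening of the flow matrix, and the weight matrix `W` is randomly generated for the demonstration. Under those assumptions the \(N \times N\) filter factors into two \(n \times n\) blocks, so a matvec needs only two small matrix products.

```python
import numpy as np

rng = np.random.default_rng(0)
n = 4

# Hypothetical row-standardised weight matrix with a zero diagonal.
W = rng.random((n, n))
np.fill_diagonal(W, 0.0)
W /= W.sum(axis=1, keepdims=True)

rho_d, rho_o = 0.4, 0.3
rho_w = -rho_d * rho_o  # separability constraint from the class description

I_n = np.eye(n)
I_N = np.eye(n * n)

# Full N x N filter under the assumed Kronecker convention.
A_full = (
    I_N
    - rho_d * np.kron(I_n, W)
    - rho_o * np.kron(W, I_n)
    - rho_w * np.kron(W, W)
)

# Factored form: (I - rho_o W) kron (I - rho_d W).
A_o = I_n - rho_o * W
A_d = I_n - rho_d * W
A_kron = np.kron(A_o, A_d)

# Matvec without ever forming the N x N matrix.
Y = rng.random((n, n))
dense_matvec = A_full @ Y.ravel()
factored_matvec = (A_o @ Y @ A_d.T).ravel()

print(np.allclose(A_full, A_kron), np.allclose(dense_matvec, factored_matvec))
```

The factored matvec touches only \(n \times n\) arrays, which is where the cubic cost in \(n\) comes from.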
- Parameters:¶
- y : array-like of int, shape (n, n) or (N,)¶
Observed non-negative integer flow counts.
- G : libpysal.graph.Graph¶
Row-standardised spatial graph on n units.
- X : np.ndarray or pandas.DataFrame, shape (N, p)¶
Full origin-destination design matrix with \(N = n^2\) rows.
- col_names : list of str, optional
Column labels for X.
- k : int, optional
Number of regional attribute columns.
- logdet_method : {"traces", "eigenvalue", "chebyshev"}, default "chebyshev"
Method for the log-determinant.
"traces" precomputes \(N \times N\) flow traces (also used by Bayesian LM diagnostics); "eigenvalue" uses the Kronecker-factored \(n \log|I_n - \rho_d W| + n \log|I_n - \rho_o W|\); "chebyshev" is a polynomial approximation for large n.
- priors : dict, optional
Override default priors. Supported keys:
- beta_mu : float, default 0.0
Normal prior mean for beta.
- beta_sigma : float, default 1e6
Normal prior std for beta.
- sigma_sigma : float, default 10.0
HalfNormal prior std for sigma.
- alpha_sigma : float, default 10.0
HalfNormal prior std for alpha.
- rho_lower : float, default -0.999
Lower bound for each \(\rho\).
- rho_upper : float, default 0.999
Upper bound for each \(\rho\).
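The Kronecker-factored expression quoted for the "eigenvalue" option of logdet_method follows from \(\det(A \otimes B) = \det(A)^n \det(B)^n\) for \(n \times n\) blocks. The following is a minimal NumPy check of that identity, not the library's implementation; the weight matrix and \(\rho\) values are made up, and `numpy.linalg.slogdet` stands in for whatever eigenvalue routine the package uses internally.

```python
import numpy as np

rng = np.random.default_rng(1)
n = 5

# Hypothetical row-standardised weight matrix with a zero diagonal.
W = rng.random((n, n))
np.fill_diagonal(W, 0.0)
W /= W.sum(axis=1, keepdims=True)

rho_d, rho_o = 0.5, -0.2
A_d = np.eye(n) - rho_d * W
A_o = np.eye(n) - rho_o * W

# Factored log-determinant: n log|I - rho_d W| + n log|I - rho_o W|.
_, logdet_d = np.linalg.slogdet(A_d)
_, logdet_o = np.linalg.slogdet(A_o)
logdet_factored = n * logdet_d + n * logdet_o

# Reference value from the dense N x N matrix (only feasible for small n).
_, logdet_dense = np.linalg.slogdet(np.kron(A_o, A_d))

print(np.isclose(logdet_factored, logdet_dense))
```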
Notes
The sampler bypasses PyMC’s NUTS entirely. It produces an arviz.InferenceData object compatible with all downstream diagnostics.
\(\alpha\) (NB dispersion) mixing can be slower than \(\rho\) or \(\beta\). Monitor ESS for \(\alpha\) specifically and use longer runs if needed.
Methods
- __init__(y, G, X, **kwargs)
- fit([draws, tune, chains, random_seed, ...])
Sample posterior via Pólya–Gamma block Gibbs (separable).
- fit_approx([draws, n, method, random_seed, ...])
Fit a variational approximation and return posterior draws.
- posterior_predictive([n_draws, random_seed])
Draw posterior-predictive samples y_rep.
- spatial_diagnostics()
Run Bayesian LM specification tests for flow models.
- spatial_diagnostics_decision([alpha, format])
Return a model-selection decision from Bayesian LM test results.
- spatial_effects([draws, ...])
Summarise posterior origin/destination/intra/network/total effects.
- summary([var_names])
Return posterior summary table via ArviZ.
Attributes
Return the most recent PyMC variational approximation, if any.
- inference_data
Return ArviZ InferenceData from the most recent fit, or None.
- pymc_model
Return the PyMC model used for the most recent fit, or None.
- fit(draws=2000, tune=1000, chains=4, random_seed=None, thin=1, return_eta=False, n_jobs=-1, progressbar=True, chebyshev_degree=30, **kwargs)[source]¶
Sample posterior via Pólya–Gamma block Gibbs (separable).
- Parameters:¶
- draws : int¶
Number of post-warmup draws per chain.
- tune : int¶
Number of warmup draws per chain.
- chains : int¶
Number of independent chains.
- random_seed : int or None¶
Seed for reproducibility.
- thin : int¶
Keep every thin-th draw. Default 1.
- return_eta : bool¶
If True, store the full latent field η. Default False.
- n_jobs : int¶
Number of chains to run in parallel. -1 uses all available CPUs.
- progressbar : bool¶
Show per-chain progress bars.
- chebyshev_degree : int, default 30¶
Chebyshev polynomial degree for η draw.
- Return type:¶
arviz.InferenceData
- fit_approx(draws=2000, n=10000, method='advi', random_seed=None, store_lambda=False, compute_log_likelihood=True, **fit_kwargs)[source]¶
Fit a variational approximation and return posterior draws.
- Parameters:¶
- draws : int, default 2000¶
Number of samples to draw from the fitted approximation.
- n : int, default 10000¶
Number of optimisation iterations for pm.fit.
- method : {"advi", "fullrank_advi"}, default "advi"¶
Variational inference family to fit.
- random_seed : int, optional¶
Seed for optimisation and posterior sampling.
- store_lambda : bool, default False¶
If True, keep the high-dimensional fitted mean lambda in the posterior draws.
- compute_log_likelihood : bool, default True¶
If True, compute pointwise log-likelihood after sampling and attach to the InferenceData (with Jacobian correction for SAR flow variants), enabling az.loo / az.waic.
- **fit_kwargs¶
Additional keyword arguments forwarded to pm.fit.
- property inference_data : arviz.data.inference_data.InferenceData | None[source]¶
Return ArviZ InferenceData from the most recent fit, or None.
- posterior_predictive(n_draws=None, random_seed=None)[source]¶
Draw posterior-predictive samples y_rep.
For each (subsampled) posterior draw, simulates a new flow vector y_rep from the implied data-generating process by solving the sparse system A(rho) y_rep = X β + ε (Gaussian) or y_rep ~ Poisson(exp(A^{-1} X β)) (Poisson variants).
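The Poisson form of the simulation step described above can be sketched for a single posterior draw. This is a hedged illustration rather than the package's code: it assumes the separable filter \(A = (I_n - \rho_o W) \otimes (I_n - \rho_d W)\) with row-major flattening, uses dense `numpy.linalg.solve` on the two \(n \times n\) factors instead of a sparse solver, and the helper names `separable_solve` and `draw_y_rep_poisson` are hypothetical.

```python
import numpy as np


def separable_solve(W, rho_d, rho_o, v):
    """Solve ((I - rho_o W) kron (I - rho_d W)) eta = v via two n x n solves."""
    n = W.shape[0]
    A_d = np.eye(n) - rho_d * W
    A_o = np.eye(n) - rho_o * W
    V = v.reshape(n, n)
    # eta_mat = A_o^{-1} V A_d^{-T}; the second solve acts on the transpose.
    Z = np.linalg.solve(A_o, V)
    eta_mat = np.linalg.solve(A_d, Z.T).T
    return eta_mat.ravel()


def draw_y_rep_poisson(W, rho_d, rho_o, X, beta, rng):
    """One replicate y_rep ~ Poisson(exp(A^{-1} X beta)) for one posterior draw."""
    eta = separable_solve(W, rho_d, rho_o, X @ beta)
    return rng.poisson(np.exp(eta))


# Usage with made-up inputs (n = 3 regions, N = 9 flows, p = 2 covariates).
rng = np.random.default_rng(2)
n = 3
W = rng.random((n, n))
np.fill_diagonal(W, 0.0)
W /= W.sum(axis=1, keepdims=True)
X = rng.normal(size=(n * n, 2))
beta = np.array([0.1, -0.2])
y_rep = draw_y_rep_poisson(W, 0.4, 0.3, X, beta, rng)
print(y_rep.shape)
```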
- property pymc_model : pymc.model.core.Model | None[source]¶
Return the PyMC model used for the most recent fit, or None.
- spatial_diagnostics()[source]¶
Run Bayesian LM specification tests for flow models.
Looks up the diagnostic suite registered for this model class and returns a tidy DataFrame with one row per test. See bayespecon.models.base.SpatialModel.spatial_diagnostics() for the column schema.
- Raises:¶
RuntimeError – If the model has not been fit yet.
- spatial_diagnostics_decision(alpha=0.05, format='graphviz')[source]¶
Return a model-selection decision from Bayesian LM test results.
Walks the flow decision tree using Bayesian p-values from spatial_diagnostics() and recommends either OLSFlow (no spatial dependence detected) or SARFlow (at least one direction is significant).
- Parameters:¶
- alpha : float, default 0.05¶
Significance level for the Bayesian p-values.
- format : {"graphviz", "ascii", "model"}, default "graphviz"¶
Output format.
"model" returns the recommended model name string. "ascii" returns an indented box-drawing tree. "graphviz" returns a graphviz.Digraph (with ASCII fallback if graphviz is not installed).
- Return type:¶
str or graphviz.Digraph
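The recommendation rule stated for this method (OLSFlow when no test is significant, SARFlow when at least one direction is) can be written as a few lines of plain Python. This is a simplified sketch under assumptions: the real decision tree may have more branches, and the function name `recommend_flow_model` and the test labels used as dictionary keys are hypothetical.

```python
def recommend_flow_model(p_values, alpha=0.05):
    """Return a model name from a mapping of test label -> Bayesian p-value.

    Simplified rule: any p-value below alpha counts as evidence of
    spatial dependence in that direction.
    """
    if any(p < alpha for p in p_values.values()):
        return "SARFlow"
    return "OLSFlow"


# Hypothetical p-values for destination- and origin-side tests.
print(recommend_flow_model({"dest": 0.20, "orig": 0.01}))
print(recommend_flow_model({"dest": 0.20, "orig": 0.35}))
```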
- spatial_effects(draws=None, return_posterior_samples=False, ci=0.95, mode='auto')[source]¶
Summarise posterior origin/destination/intra/network/total effects.
Wraps _compute_spatial_effects_posterior() to produce a tidy DataFrame indexed by predictor with posterior means, credible-interval bounds, and Bayesian p-values for each effect type (origin, destination, intra, network, total). Following Thomas-Agnan & LeSage (2014, §83.5.2), when destination and origin design blocks differ the decomposition is reported separately for shocks applied to each side.
- Parameters:¶
- draws : int, optional¶
Maximum number of posterior draws to use. Defaults to all.
- return_posterior_samples : bool, default False¶
If True, also return the underlying posterior-draw arrays.
- ci : float, default 0.95¶
Credible-interval coverage.
- mode : {"auto", "combined", "separate"}, default "auto"¶
Controls whether destination- and origin-side effects are summed or reported separately.
"auto" collapses to combined when the destination and origin design blocks are identical (self._symmetric_xo_xd) and reports both sides otherwise. "combined" always sums; "separate" always reports both.
- Returns:¶
Long-format summary indexed by (predictor, side, effect) where side is one of "combined", "dest", "orig".
- Return type:¶
pandas.DataFrame, or (DataFrame, dict)
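How the mode argument maps to the side level of the returned index can be illustrated with a small sketch. It follows the behaviour described for "auto", "combined", and "separate" above; the helper names `resolve_effects_mode` and `sides_for` are hypothetical and are not part of the documented API.

```python
def resolve_effects_mode(mode, symmetric_xo_xd):
    """Map the user-facing mode to 'combined' or 'separate'."""
    if mode not in {"auto", "combined", "separate"}:
        raise ValueError(f"unknown mode: {mode!r}")
    if mode == "auto":
        # Identical destination and origin design blocks collapse to one side.
        return "combined" if symmetric_xo_xd else "separate"
    return mode


def sides_for(resolved_mode):
    """Values of the 'side' index level produced by each resolved mode."""
    return ["combined"] if resolved_mode == "combined" else ["dest", "orig"]


print(sides_for(resolve_effects_mode("auto", symmetric_xo_xd=False)))
```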