bayespecon.models.flow.FlowModel¶

class bayespecon.models.flow.FlowModel(y, G, X, col_names=None, k=None, priors=None, logdet_method='traces', restrict_positive=True, miter=30, titer=800, trace_riter=50, trace_seed=None, symmetric_xo_xd=None, backend=None)¶

Abstract base class for Bayesian spatial flow regression models.

Unlike SpatialModel, this class works with an \(N = n^2\) vectorised response and three Kronecker-product weight matrices constructed from a single n×n graph. The API mirrors SpatialModel (fit, summary, inference_data) but the internals are tailored to the flow structure.

The model accepts a full O-D design matrix X of shape (n², p), typically produced by flow_design_matrix() or flow_design_matrix_with_orig().
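The three Kronecker-product weight matrices can be illustrated with a small SciPy sketch. It assumes the usual LeSage and Pace construction (destination \(I_n \otimes W\), origin \(W \otimes I_n\), network \(W \otimes W\)); the helper name flow_weights is hypothetical, and the vectorisation order used internally by the library may differ.

```python
import numpy as np
import scipy.sparse as sp


def flow_weights(W):
    """Build destination, origin, and network weights from an n x n matrix W.

    Hypothetical helper for illustration; not part of bayespecon.
    """
    W = sp.csr_matrix(W)
    I_n = sp.identity(W.shape[0], format="csr")
    W_d = sp.kron(I_n, W, format="csr")  # destination-based dependence
    W_o = sp.kron(W, I_n, format="csr")  # origin-based dependence
    W_w = sp.kron(W, W, format="csr")    # origin-to-destination (network) dependence
    return W_d, W_o, W_w


# Row-standardised weights for a 3-unit line graph (illustrative data).
W = np.array([[0.0, 1.0, 0.0],
              [0.5, 0.0, 0.5],
              [0.0, 1.0, 0.0]])
W_d, W_o, W_w = flow_weights(W)
print(W_d.shape, W_o.shape, W_w.shape)  # each matrix is N x N with N = n**2
```

Because W is row-standardised, each of the three N×N matrices is row-standardised as well, and W_w equals the product W_o W_d.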

Parameters:¶

y : array-like, shape (n, n) or (N,)¶: Observed O-D flow matrix (or its vec-form). Must be a square matrix or a flat vector of length \(N = n^2\).
G : libpysal.graph.Graph¶: Row-standardised spatial graph on n units. Validated by _validate_graph().
X : np.ndarray or pandas.DataFrame, shape (N, p)¶: Full origin-destination design matrix with \(N = n^2\) rows. This is typically produced by flow_design_matrix() or flow_design_matrix_with_orig(). If a DataFrame, column names are inferred automatically.
col_names : list[str], optional¶: Column labels for X. If X is a DataFrame, its column names are used automatically; otherwise defaults to ["x0", "x1", ...].
k : int, optional¶: Number of regional attribute columns in the design matrix (i.e., the number of destination/origin variable pairs). When the design matrix follows the standard LeSage layout [intercept, intra_indicator, dest_*, orig_*, intra_*, (dist)], k can be inferred from the column names. Provide k explicitly if column names do not follow the dest_*/orig_* convention.
priors : dict, optional¶: Override default priors. Supported keys vary by subclass.
logdet_method : str, default "traces"¶: How to compute \(\log|I_N - \rho_d W_d - \rho_o W_o - \rho_w W_w|\). "traces" (the default and recommended method) uses Barry-Pace stochastic traces with the multinomial Kronecker identity. "eigenvalue", "chebyshev", and "trace_mc" (separable flow models only) use the Kronecker eigenvalue factorisation.
restrict_positive : bool, default True¶: If True, use a pm.Dirichlet prior that restricts \(\rho_d, \rho_o, \rho_w \geq 0\) with \(\rho_d + \rho_o + \rho_w \leq 1\). This is NUTS-safe and appropriate for most flow applications. If False, use three independent pm.Uniform(-1, 1) priors with a differentiable quadratic-wall stability potential.
miter : int, default 30¶: Trace polynomial order for the log-determinant (only used when logdet_method="traces"). Higher values improve accuracy at the cost of more precomputation.
titer : int, default 800¶: Geometric tail cutoff for the log-determinant series.
trace_riter : int, default 50¶: Number of Monte Carlo probes for trace estimation.
trace_seed : int, optional¶: Random seed for trace estimation reproducibility.
symmetric_xo_xd : bool, optional¶: If None (default), the destination and origin design blocks are compared and symmetry is auto-detected. Set explicitly to override the heuristic — for example, when using flow_design_matrix_with_orig() with distinct attributes for the origin and destination sides. Controls the default behaviour of spatial_effects() when mode="auto".
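The Kronecker structure behind logdet_method means the log-determinant of the N×N matrix can be written in terms of the n eigenvalues of the underlying weight matrix. The following is a minimal dense NumPy sketch of that identity, not the library's implementation; it assumes \(W_d = I_n \otimes W\), \(W_o = W \otimes I_n\), \(W_w = W \otimes W\), and the function name flow_logdet_eig is hypothetical.

```python
import numpy as np


def flow_logdet_eig(W, rho_d, rho_o, rho_w):
    """log|I_N - rho_d W_d - rho_o W_o - rho_w W_w| from the eigenvalues of W.

    Hypothetical helper for illustration; not part of bayespecon.
    """
    lam = np.linalg.eigvals(W)  # n eigenvalues, possibly complex
    li, lj = np.meshgrid(lam, lam, indexing="ij")
    # Eigenvalues of the N x N matrix, one per (i, j) pair.
    terms = 1.0 - rho_d * lj - rho_o * li - rho_w * li * lj
    return float(np.sum(np.log(np.abs(terms))))


# Row-standardised weights for a 3-unit line graph (illustrative data).
W = np.array([[0.0, 1.0, 0.0],
              [0.5, 0.0, 0.5],
              [0.0, 1.0, 0.0]])
rho_d, rho_o, rho_w = 0.3, 0.2, 0.1

# Brute-force check against the dense N x N determinant.
I_n = np.eye(W.shape[0])
A = (np.eye(W.shape[0] ** 2)
     - rho_d * np.kron(I_n, W)
     - rho_o * np.kron(W, I_n)
     - rho_w * np.kron(W, W))
exact = np.linalg.slogdet(A)[1]
approx = flow_logdet_eig(W, rho_d, rho_o, rho_w)
print(approx, exact)
```

The eigenvalue route costs one n×n decomposition up front and O(n²) work per evaluation, whereas the dense determinant works on an n²×n² matrix.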

Methods

`__init__`(y, G, X[, col_names, k, priors, ...])
`fit`([draws, tune, chains, target_accept, ...])	Draw samples from the posterior.
`fit_approx`([draws, n, method, random_seed, ...])	Fit a variational approximation and return posterior draws.
`posterior_predictive`([n_draws, random_seed, ...])	Draw posterior-predictive samples `y_rep`.
`spatial_diagnostics`()	Run Bayesian LM specification tests for flow models.
`spatial_diagnostics_decision`([alpha, ...])	Return a model-selection decision from Bayesian LM test results.
`spatial_effects`([draws, ...])	Summarise posterior origin/destination/intra/network/total effects.
`summary`([var_names])	Return posterior summary table via ArviZ.

Attributes

`approximation`	Return the most recent PyMC variational approximation, if any.
`inference_data`	Return ArviZ InferenceData from the most recent fit, or None.
`pymc_model`	Return the PyMC model used for the most recent fit, or None.

property approximation¶: Return the most recent PyMC variational approximation, if any.

fit(draws=2000, tune=1000, chains=4, target_accept=0.9, random_seed=None, store_lambda=False, idata_kwargs=None, **sample_kwargs)¶

Draw samples from the posterior.

Parameters:¶

draws : int, default 2000¶: Number of posterior samples per chain (after tuning).
tune : int, default 1000¶: Number of tuning (warm-up) steps per chain.
chains : int, default 4¶: Number of parallel chains.
target_accept : float, default 0.9¶: Target acceptance rate for NUTS.
random_seed : int, optional¶: Seed for reproducibility.
store_lambda : bool, default False¶: If True, include the high-dimensional fitted mean lambda in the stored posterior. Leaving this False reduces memory and conversion overhead for Poisson flow models.
idata_kwargs : dict, optional¶: Forwarded to pm.sample. Defaults to {"log_likelihood": True} so that az.loo / az.waic / az.compare work out of the box; for SAR flow variants the captured Gaussian log-likelihood is post-processed to add the Jacobian contribution from log|I_N - rho_d W_d - rho_o W_o - rho_w W_w|.
**sample_kwargs¶: Additional keyword arguments forwarded to pm.sample.

Return type:¶

arviz.InferenceData

fit_approx(draws=2000, n=10000, method='advi', random_seed=None, store_lambda=False, compute_log_likelihood=True, **fit_kwargs)¶

Fit a variational approximation and return posterior draws.

Parameters:¶

draws : int, default 2000¶: Number of samples to draw from the fitted approximation.
n : int, default 10000¶: Number of optimisation iterations for pm.fit.
method : {"advi", "fullrank_advi"}, default "advi"¶: Variational inference family to fit.
random_seed : int, optional¶: Seed for optimisation and posterior sampling.
store_lambda : bool, default False¶: If True, keep the high-dimensional fitted mean lambda in the posterior draws.
compute_log_likelihood : bool, default True¶: If True, compute pointwise log-likelihood after sampling and attach to the InferenceData (with Jacobian correction for SAR flow variants), enabling az.loo / az.waic.
**fit_kwargs¶: Additional keyword arguments forwarded to pm.fit.

property inference_data : arviz.data.inference_data.InferenceData | None¶: Return ArviZ InferenceData from the most recent fit, or None.

posterior_predictive(n_draws=None, random_seed=None, parallel=-1)¶

Draw posterior-predictive samples y_rep.

For each (subsampled) posterior draw, simulates a new flow vector y_rep from the implied data-generating process. Gaussian variants solve the sparse system A(ρ) y_rep = X β + ε; Poisson variants draw y_rep ~ Poisson(exp(A(ρ)^{-1} X β)). Here A(ρ) = I_N - ρ_d W_d - ρ_o W_o - ρ_w W_w.
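A single Gaussian draw of this kind can be sketched with SciPy's sparse LU solver. This is an illustration under stated assumptions rather than the library's code: the function name simulate_gaussian_flow, the noise scale sigma, and the Kronecker ordering of the weight matrices are all assumptions.

```python
import numpy as np
import scipy.sparse as sp
from scipy.sparse.linalg import splu


def simulate_gaussian_flow(W, X, beta, rho_d, rho_o, rho_w, sigma, rng):
    """Solve A(rho) y = X beta + eps for one simulated flow vector.

    Hypothetical helper for illustration; not part of bayespecon.
    """
    n = W.shape[0]
    W = sp.csc_matrix(W)
    I_n = sp.identity(n, format="csc")
    A = (sp.identity(n * n, format="csc")
         - rho_d * sp.kron(I_n, W, format="csc")
         - rho_o * sp.kron(W, I_n, format="csc")
         - rho_w * sp.kron(W, W, format="csc")).tocsc()
    eps = rng.normal(0.0, sigma, size=n * n)  # Gaussian disturbance
    return splu(A).solve(X @ beta + eps)      # sparse solve, no explicit inverse


rng = np.random.default_rng(0)
W = np.array([[0.0, 1.0, 0.0],
              [0.5, 0.0, 0.5],
              [0.0, 1.0, 0.0]])
X = np.column_stack([np.ones(9), rng.normal(size=9)])  # illustrative (N, p) design
beta = np.array([1.0, 0.5])
y_rep = simulate_gaussian_flow(W, X, beta, 0.3, 0.2, 0.1, sigma=1.0, rng=rng)
print(y_rep.shape)
```

Solving the system directly avoids forming A(ρ)^{-1}, which would be dense even when A(ρ) is sparse.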

Parameters:¶

n_draws : int, optional¶: Number of posterior draws to use. Defaults to all available.
random_seed : int, optional¶: Seed for the noise/Poisson sampler.
parallel : int or None, default -1¶: Number of worker threads for the per-draw loop. -1 uses os.cpu_count(); None/0/1 forces sequential execution. Reproducibility under a fixed random_seed is preserved across worker counts via SeedSequence.spawn.

Returns:¶

Array of shape (n_draws, N) with posterior-predictive flows.

Return type:¶

np.ndarray
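The reproducibility guarantee for parallel relies on giving every draw its own child seed, so the result does not depend on which worker handles which draw. A minimal sketch of that pattern with NumPy's SeedSequence.spawn follows; the function names and the Poisson rate are illustrative assumptions, not the library's internals.

```python
from concurrent.futures import ThreadPoolExecutor

import numpy as np


def draw_one(seed_seq, size):
    """Simulate one draw from its own independent random stream."""
    rng = np.random.default_rng(seed_seq)
    return rng.poisson(5.0, size=size)  # illustrative rate


def simulate(n_draws, size, random_seed, workers):
    """Hypothetical per-draw loop; one child seed per draw."""
    children = np.random.SeedSequence(random_seed).spawn(n_draws)
    if workers in (None, 0, 1):
        rows = [draw_one(s, size) for s in children]  # sequential path
    else:
        with ThreadPoolExecutor(max_workers=workers) as pool:
            # map() returns results in input order, whatever the scheduling.
            rows = list(pool.map(lambda s: draw_one(s, size), children))
    return np.vstack(rows)


sequential = simulate(n_draws=8, size=4, random_seed=123, workers=1)
threaded = simulate(n_draws=8, size=4, random_seed=123, workers=4)
print(np.array_equal(sequential, threaded))
```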

property pymc_model : pymc.model.core.Model | None¶: Return the PyMC model used for the most recent fit, or None.

spatial_diagnostics()¶

Run Bayesian LM specification tests for flow models.

Iterates over the class-level _spatial_diagnostics_tests registry and returns a tidy DataFrame with one row per test. See bayespecon.models.base.SpatialModel.spatial_diagnostics() for the column schema.

Raises:¶: RuntimeError – If the model has not been fit yet.

spatial_diagnostics_decision(alpha=0.05, format='graphviz', theme='default')¶

Return a model-selection decision from Bayesian LM test results.

Walks the flow decision tree using Bayesian p-values from spatial_diagnostics() and recommends either OLSFlow (no spatial dependence detected) or SARFlow (at least one direction is significant).

Parameters:¶

alpha : float, default 0.05¶: Significance level for the Bayesian p-values.
format : {"graphviz", "ascii", "model"}, default "graphviz"¶: Output format. "model" returns the recommended model name string. "ascii" returns an indented box-drawing tree. "graphviz" returns a graphviz.Digraph (with ASCII fallback if graphviz is not installed).

Return type:¶

str or graphviz.Digraph
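The two-outcome rule described for this method can be sketched as a simple threshold check. The mapping of test names to p-values used here is hypothetical; the actual decision tree and the column schema returned by spatial_diagnostics() may be more detailed.

```python
def recommend_flow_model(p_values, alpha=0.05):
    """Return "SARFlow" if any direction is significant, else "OLSFlow".

    p_values maps a test name to its Bayesian p-value. The key names
    are illustrative assumptions, not the library's schema.
    """
    significant = [name for name, p in p_values.items() if p < alpha]
    return "SARFlow" if significant else "OLSFlow"


no_dependence = {"dest": 0.40, "orig": 0.22, "network": 0.61}
one_direction = {"dest": 0.01, "orig": 0.22, "network": 0.61}
print(recommend_flow_model(no_dependence))
print(recommend_flow_model(one_direction))
```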

spatial_effects(draws=None, return_posterior_samples=False, ci=0.95, mode='auto', parallel=-1)¶

Summarise posterior origin/destination/intra/network/total effects.

Wraps _compute_spatial_effects_posterior() to produce a tidy DataFrame indexed by predictor with posterior means, credible-interval bounds, and Bayesian p-values for each effect type (origin, destination, intra, network, total). Following Thomas-Agnan & LeSage (2014, §83.5.2), when destination and origin design blocks differ the decomposition is reported separately for shocks applied to each side.

Parameters:¶

draws : int, optional¶: Maximum number of posterior draws to use. Defaults to all.
return_posterior_samples : bool, default False¶: If True, also return the underlying posterior-draw arrays.
ci : float, default 0.95¶: Credible-interval coverage.
mode : {"auto", "combined", "separate"}, default "auto"¶: Controls whether destination- and origin-side effects are summed or reported separately. "auto" collapses to combined when the destination and origin design blocks are identical (self._symmetric_xo_xd) and reports both sides otherwise. "combined" always sums; "separate" always reports both.
parallel : int or None, default -1¶: Number of worker threads for the per-draw effects loop. -1 uses os.cpu_count(); None/0/1 forces sequential execution. Ignored by closed-form (OLSFlow, SEMFlow) variants.

Returns:¶

Long-format summary indexed by (predictor, side, effect) where side is one of "combined", "dest", "orig".

Return type:¶

pandas.DataFrame, or (DataFrame, dict)
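The per-effect summary statistics can be sketched from raw posterior draws as follows. The helper name, the output keys, and in particular the two-sided definition of the Bayesian p-value are assumptions made for illustration; the library may define these quantities differently.

```python
import numpy as np


def summarise_effects(samples, ci=0.95):
    """Posterior mean, equal-tailed credible interval, and a two-sided
    Bayesian p-value for each effect.

    samples maps an effect name to a 1-D array of posterior draws.
    Hypothetical helper for illustration; not part of bayespecon.
    """
    lower_q = (1.0 - ci) / 2.0
    upper_q = 1.0 - lower_q
    out = {}
    for effect, draws in samples.items():
        draws = np.asarray(draws, dtype=float)
        p_pos = float(np.mean(draws > 0))  # posterior mass above zero
        out[effect] = {
            "mean": float(draws.mean()),
            "ci_lower": float(np.quantile(draws, lower_q)),
            "ci_upper": float(np.quantile(draws, upper_q)),
            "bayes_pvalue": 2.0 * min(p_pos, 1.0 - p_pos),
        }
    return out


rng = np.random.default_rng(1)
samples = {
    "origin": np.linspace(0.1, 1.1, 101),       # deterministic, all positive
    "network": rng.normal(0.0, 1.0, size=500),  # centred on zero
}
result = summarise_effects(samples, ci=0.95)
print(result["origin"])
```

Passing the result to pandas.DataFrame.from_dict(result, orient="index") gives a table with one row per effect.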

summary(var_names=None, **kwargs)¶

Return posterior summary table via ArviZ.

Parameters:¶

var_names : list, optional¶: Variable names to include. Defaults to all parameters.
**kwargs¶: Additional keyword arguments forwarded to az.summary.

Return type:¶

pandas.DataFrame