bayespecon.models.base.SpatialModel
class bayespecon.models.base.SpatialModel(formula=None, data=None, y=None, X=None, W=None, priors=None, logdet_method=None, robust=False, w_vars=None, backend=None)
Base class for Bayesian spatial regression models. Models follow the notation of [Anselin, 1988] and [LeSage and Pace, 2009]. The API supports both formula and matrix input modes.
- Parameters:
- formula : str, optional
Wilkinson-style formula string, e.g. "price ~ poverty + rev_rating". If provided, data must also be supplied. An intercept is included by default; suppress with "y ~ x - 1".
- data : DataFrame or GeoDataFrame, optional
Data source when using formula mode.
- y : array-like, optional
Dependent variable. Required in matrix mode.
- X : array-like, optional
Predictor matrix. Required in matrix mode. If a DataFrame, column names are preserved for labelling.
- W : libpysal.graph.Graph or scipy.sparse matrix
Spatial weights matrix of shape (n, n). Accepts a libpysal.graph.Graph (the modern libpysal graph API) or any scipy.sparse matrix. The legacy libpysal.weights.W object is not accepted directly; pass w.sparse to use the underlying sparse matrix, or convert with libpysal.graph.Graph.from_W(w). W should be row-standardised; a UserWarning is issued if not.
- priors : dict, optional
Override default priors. Keys depend on the model subclass; see each model’s docstring for supported keys.
- logdet_method : str
How to compute log|I - rho*W|. "eigenvalue" (default for n <= 2000) pre-computes W’s eigenvalues once and evaluates in O(n) per step; "exact" uses the symbolic pytensor det (slow for n > 500); "grid_dense" uses a dense eigenvalue grid + cubic-spline interpolation (MATLAB-style lndetfull for dense W); "grid_sparse" uses a sparse-LU grid + cubic-spline interpolation (lndetfull style for large sparse W); "sparse_spline" uses sparse-LU + spline on [max(rho_min, 0), rho_max] (lndetint style); "grid_mc" uses a Monte Carlo trace approximation (lndetmc); "grid_ilu" uses an ILU-based approximation (lndetichol analog); "chebyshev" (default for n > 2000) uses a Chebyshev polynomial approximation evaluated via Clenshaw’s algorithm.
- robust : bool, default False
If True, use a Student-t error distribution instead of Normal, yielding a model that is robust to heavy-tailed outliers. When robust=True, a nu (degrees of freedom) parameter is added to the model with an \(\mathrm{Exp}(\lambda_\nu)\) prior (default nu_lam = 1/30, mean ≈ 30). The nu prior can be controlled via the priors dict with key nu_lam.
- w_vars : list of str, optional
Names of X columns to spatially lag. Only relevant for models that include WX terms (SLX, SDM, SDEM and their panel/Tobit variants). By default all non-constant columns are lagged. Pass a subset to restrict which variables receive a spatial lag, e.g. w_vars=["income", "density"].
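The W and logdet_method descriptions above can be illustrated without the library itself. The following is a minimal sketch, assuming a small ring-lattice adjacency matrix invented for the example: it row-standardises a scipy.sparse matrix and then evaluates log|I - rho*W| from eigenvalues computed once, which is the idea behind the "eigenvalue" option. The helper names row_standardise and logdet_eigen are hypothetical and are not part of bayespecon.

```python
import numpy as np
import scipy.sparse as sp


def row_standardise(A):
    """Scale each row of a sparse matrix so that it sums to 1 (hypothetical helper)."""
    A = sp.csr_matrix(A, dtype=float)
    row_sums = np.asarray(A.sum(axis=1)).ravel()
    # Leave all-zero rows (isolated units) untouched to avoid dividing by zero.
    scale = np.divide(1.0, row_sums, out=np.zeros_like(row_sums), where=row_sums != 0)
    return sp.diags(scale) @ A


def logdet_eigen(rho, eigvals):
    """log|I - rho*W| as a sum over W's eigenvalues: O(n) per evaluation."""
    # Eigenvalues of a non-symmetric W may be complex; the imaginary parts cancel.
    terms = np.log(1.0 - rho * np.asarray(eigvals, dtype=complex))
    return float(np.sum(terms).real)


# Toy ring lattice: every unit has two neighbours (assumed example data).
n = 6
idx = np.arange(n)
rows = np.concatenate([idx, idx])
cols = np.concatenate([(idx + 1) % n, (idx - 1) % n])
A = sp.csr_matrix((np.ones(2 * n), (rows, cols)), shape=(n, n))

W = row_standardise(A)
eigvals = np.linalg.eigvals(W.toarray())  # computed once, reused for every rho

rho = 0.4
approx = logdet_eigen(rho, eigvals)
sign, exact = np.linalg.slogdet(np.eye(n) - rho * W.toarray())
```

For a matrix this small the dense slogdet call is just as cheap; the eigenvalue route pays off when the log-determinant has to be re-evaluated at many values of rho during sampling.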
__init__(formula=None, data=None, y=None, X=None, W=None, priors=None, logdet_method=None, robust=False, w_vars=None, backend=None)
Methods
__init__([formula, data, y, X, W, priors, ...])
fit([draws, tune, chains, target_accept, ...]): Draw samples from the posterior.
Return fitted values at posterior mean parameters.
Return residuals on the observed (or transformed-panel) scale.
spatial_diagnostics(): Run Bayesian LM specification tests and return a summary table.
spatial_diagnostics_decision([alpha, ...]): Return a model-selection decision from Bayesian LM test results.
spatial_effects([return_posterior_samples]): Compute Bayesian inference for direct, indirect, and total impacts.
summary([var_names]): Return posterior summary table.
Attributes
inference_data: Return the ArviZ InferenceData from the most recent fit.
pymc_model: Return the PyMC model object built for the most recent fit.
fit(draws=2000, tune=1000, chains=4, target_accept=0.9, random_seed=None, **sample_kwargs)
Draw samples from the posterior.
- Parameters:
- draws : int
Number of posterior samples per chain (after tuning).
- tune : int
Number of tuning (burn-in) steps per chain.
- chains : int
Number of parallel chains.
- target_accept : float
Target acceptance rate for NUTS.
- random_seed : int, optional
Seed for reproducibility.
- **sample_kwargs
Additional keyword arguments forwarded to pm.sample. Pass nuts_sampler="blackjax" (or "numpyro", "nutpie") to select an alternative NUTS backend; defaults to PyMC’s built-in sampler.
- Return type:
arviz.InferenceData
- property inference_data : arviz.data.inference_data.InferenceData | None
Return the ArviZ InferenceData from the most recent fit.
- property pymc_model : pymc.model.core.Model | None
Return the PyMC model object built for the most recent fit.
- spatial_diagnostics()
Run Bayesian LM specification tests and return a summary table.
Iterates over the class-level _spatial_diagnostics_tests registry and calls each test function on this fitted model, collecting the results into a tidy DataFrame. The set of tests depends on the model type.
Requires the model to have been fit (.fit() called). For cross-sectional models a spatial weights matrix W must also have been supplied at construction time.
- Returns:
DataFrame indexed by test name with columns statistic (posterior mean), median, df (degrees of freedom for the \(\chi^2\) reference), p_value (Bayesian p-value 1 - chi2.cdf(mean, df)), and ci_lower / ci_upper (95% credible interval). The DataFrame carries attrs["model_type"] and attrs["n_draws"] metadata.
- Return type:
pandas.DataFrame
- Raises:
RuntimeError – If the model has not been fit yet.
ValueError – If a cross-sectional model was constructed without W.
See also
spatial_diagnostics_decision: Model-selection decision based on the test results.
spatial_effects: Posterior inference for direct/indirect/total impacts.
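To make the column definitions above concrete, here is a minimal sketch of how one row of such a table could be computed from posterior draws of a test statistic. It is not the library's implementation: the draws are simulated, the test names are placeholders, the helper summarise_lm_draws is hypothetical, and the 95% interval is assumed to be equal-tailed (2.5th and 97.5th percentiles).

```python
import numpy as np
import pandas as pd
from scipy.stats import chi2


def summarise_lm_draws(draws, df):
    """Summarise posterior draws of an LM statistic (hypothetical helper)."""
    draws = np.asarray(draws, dtype=float)
    mean = draws.mean()
    lower, upper = np.percentile(draws, [2.5, 97.5])  # assumed equal-tailed interval
    return {
        "statistic": mean,                    # posterior mean of the statistic
        "median": np.median(draws),
        "df": df,                             # degrees of freedom of the chi-square reference
        "p_value": 1.0 - chi2.cdf(mean, df),  # Bayesian p-value at the posterior mean
        "ci_lower": lower,
        "ci_upper": upper,
    }


# Simulated draws standing in for real posterior output (assumed data).
rng = np.random.default_rng(0)
n_draws = 4000
fake_draws = {
    "test_a": rng.chisquare(1, size=n_draws) + 5.0,  # statistic shifted away from the null
    "test_b": rng.chisquare(1, size=n_draws),        # statistic consistent with the null
}

rows = {name: summarise_lm_draws(d, df=1) for name, d in fake_draws.items()}
table = pd.DataFrame.from_dict(rows, orient="index")
table.attrs["n_draws"] = n_draws
```

In this toy table, test_a has a small p_value and test_b does not, which is the kind of contrast the decision procedure described next relies on.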
spatial_diagnostics_decision(alpha=0.05, format='graphviz', theme='default')
Return a model-selection decision from Bayesian LM test results.
Implements the decision tree from Koley and Bera [2024] (the Bayesian analogue of the classical stge_kb procedure in Anselin et al. [1996]), adapted for panel models following Elhorst [2014] when invoked on a panel subclass. See the cross-sectional / panel-specific docstrings on the leaf classes for the full set of branches consulted.
- Parameters:
- alpha : float, default 0.05
Significance level for the Bayesian p-values.
- format : {"graphviz", "ascii", "model"}, default "graphviz"
Output format. "model" returns the recommended-model name string. "ascii" returns an indented box-drawing rendering of the full decision tree with the chosen path highlighted. "graphviz" returns a graphviz.Digraph object that renders inline in Jupyter; if the optional graphviz package is not installed a UserWarning is issued and the ASCII rendering is returned instead.
- Returns:
Recommended model name when format="model", an ASCII tree string when format="ascii", or a graphviz.Digraph when format="graphviz" (falling back to the ASCII string when graphviz is not installed).
- Return type:
str or graphviz.Digraph
See also
spatial_diagnostics: Compute the Bayesian LM test statistics.
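The three output formats and the graphviz fallback described above follow a common optional-dependency pattern. The sketch below shows that pattern only; the function render_decision, the model name, and the toy tree text are all hypothetical and do not reproduce the library's decision tree or its code.

```python
import warnings


def render_decision(model_name, ascii_tree, format="graphviz"):
    """Return a decision in one of three formats (hypothetical illustration)."""
    if format == "model":
        return model_name
    if format == "ascii":
        return ascii_tree
    if format == "graphviz":
        try:
            import graphviz  # optional dependency, imported lazily
        except ImportError:
            warnings.warn(
                "graphviz is not installed; returning the ASCII rendering instead.",
                UserWarning,
            )
            return ascii_tree
        # Build a two-node placeholder graph; a real tree would have more branches.
        dot = graphviz.Digraph()
        dot.node("root", "LM tests")
        dot.node("leaf", model_name)
        dot.edge("root", "leaf")
        return dot
    raise ValueError(f"Unknown format: {format!r}")


# Placeholder inputs, not output of the real decision procedure.
toy_tree = "LM tests\n└── SAR"
name_only = render_decision("SAR", toy_tree, format="model")
as_text = render_decision("SAR", toy_tree, format="ascii")
```

Importing graphviz inside the function keeps the package optional: callers who only ask for "model" or "ascii" never trigger the import.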
spatial_effects(return_posterior_samples=False)
Compute Bayesian inference for direct, indirect, and total impacts.
Computes impact measures for each posterior draw, then summarises the posterior distribution with means, 95% credible intervals, and Bayesian p-values. This is the fully Bayesian analog of the simulation-based approach in LeSage and Pace [2009] and the asymptotic variance formulas in Arbia et al. [2020].
Models without a spatial lag on y do not exhibit global feedback propagation through \((I-\rho W)^{-1}\). However, models with spatially lagged covariates (SLX, SDEM) can still have non-zero neighbour spillovers captured in the indirect term.
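As an illustration of computing impacts draw by draw, here is a minimal sketch for the spatial lag case, assuming the model y = rho*W*y + X*beta + e and the scalar summaries of LeSage and Pace [2009]: for feature k the impact matrix is \((I-\rho W)^{-1}\beta_k\), the direct impact is its average diagonal element, the total impact is its average row sum, and the indirect impact is the difference. The posterior draws and the weights matrix are simulated, and sar_impacts is a hypothetical helper rather than the library's code.

```python
import numpy as np


def sar_impacts(rho_draws, beta_draws, W):
    """Per-draw direct, indirect and total impacts for a spatial lag model."""
    n = W.shape[0]
    G, k = beta_draws.shape
    direct = np.empty((G, k))
    total = np.empty((G, k))
    identity = np.eye(n)
    for g in range(G):
        # Spatial multiplier for this draw of rho.
        M = np.linalg.inv(identity - rho_draws[g] * W)
        direct[g] = (np.trace(M) / n) * beta_draws[g]  # average own-unit effect
        total[g] = (M.sum() / n) * beta_draws[g]       # average row sum
    return {"direct": direct, "indirect": total - direct, "total": total}


# Toy row-standardised ring lattice with two neighbours per unit (assumed data).
n = 6
W = np.zeros((n, n))
for i in range(n):
    W[i, (i + 1) % n] = 0.5
    W[i, (i - 1) % n] = 0.5

# Simulated posterior draws standing in for sampler output (assumed data).
rng = np.random.default_rng(1)
G = 200
rho_draws = rng.uniform(0.2, 0.5, size=G)
beta_draws = rng.normal(loc=[1.0, -0.5], scale=0.1, size=(G, 2))

impacts = sar_impacts(rho_draws, beta_draws, W)

# Posterior summaries per feature: mean and equal-tailed 95% interval.
direct_mean = impacts["direct"].mean(axis=0)
direct_ci = np.percentile(impacts["direct"], [2.5, 97.5], axis=0)
```

Because W is row-standardised here, every row of the multiplier sums to 1/(1 - rho), so the total impact of each draw reduces to beta/(1 - rho); that identity is a convenient sanity check on the loop.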
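- Parameters:
- return_posterior_samples : bool, default False
If True, also return the posterior draws of the impacts alongside the summary table.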
- Returns:
If return_posterior_samples is False (default), returns a DataFrame indexed by feature names with columns for posterior means, credible-interval bounds, and Bayesian p-values.
If return_posterior_samples is True, returns (DataFrame, dict) where the dict has keys "direct", "indirect", "total", each mapping to a (G, k) array of posterior draws.
- Return type: