bayespecon.models.SDM

class bayespecon.models.SDM(formula=None, data=None, y=None, X=None, W=None, priors=None, logdet_method=None, robust=False, w_vars=None)

Bayesian Spatial Durbin Model.
\[y = \rho Wy + X\beta_1 + WX\beta_2 + \varepsilon, \quad \varepsilon \sim N(0, \sigma^2 I)\]

Parameters:
- formula, data, y, X, W, priors, logdet_method, w_vars (each defaults to None): See SpatialModel. Use w_vars to restrict which X columns are spatially lagged.
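To make the model equation concrete, the following is a minimal NumPy sketch that simulates data from the reduced form \(y = (I - \rho W)^{-1}(X\beta_1 + WX\beta_2 + \varepsilon)\). It does not use bayespecon; the ring-shaped weights matrix, the helper names `row_standardized_ring` and `simulate_sdm`, and all parameter values are illustrative assumptions.

```python
import numpy as np

def row_standardized_ring(n):
    """Row-standardised weights for a ring: each unit has two neighbours."""
    W = np.zeros((n, n))
    idx = np.arange(n)
    W[idx, (idx - 1) % n] = 0.5
    W[idx, (idx + 1) % n] = 0.5
    return W

def simulate_sdm(W, X, rho, beta1, beta2, sigma, rng):
    """Draw y from y = (I - rho W)^{-1} (X beta1 + W X beta2 + eps)."""
    n = W.shape[0]
    eps = rng.normal(0.0, sigma, size=n)
    rhs = X @ beta1 + (W @ X) @ beta2 + eps
    y = np.linalg.solve(np.eye(n) - rho * W, rhs)
    return y, eps

rng = np.random.default_rng(0)
n = 50
W = row_standardized_ring(n)
X = rng.normal(size=(n, 2))
beta1 = np.array([1.0, -0.5])   # coefficients on X (illustrative values)
beta2 = np.array([0.3, 0.2])    # coefficients on WX (illustrative values)
y, eps = simulate_sdm(W, X, rho=0.4, beta1=beta1, beta2=beta2, sigma=1.0, rng=rng)

# Plugging y back into the structural equation recovers the simulated errors.
resid = y - (0.4 * (W @ y) + X @ beta1 + (W @ X) @ beta2)
```

Arrays like these could in principle be supplied through the `y`, `X`, and `W` arguments, but the accepted type for `W` is documented in SpatialModel and is not assumed here.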
Notes

The priors dict supports the following keys:
- rho_lower, rho_upper (float, default -1, 1): Bounds for the Uniform prior on rho.
- beta_mu (float, default 0): Prior mean for beta.
- beta_sigma (float, default 1e6): Prior std for beta.
- sigma_sigma (float, default 10): Scale for HalfNormal prior on sigma.
- nu_lam (float, default 1/30): Rate for Exponential prior on \(\nu\) (only used when robust=True).
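As an illustration of the keys listed above, a priors dict might look like the sketch below. The default values are copied from the list; the override values are hypothetical, and whether a partial dict (only the overridden keys) is accepted is not stated here, so the sketch spells out every key.

```python
# Keys and defaults as listed above; nu_lam is only used when robust=True.
default_priors = {
    "rho_lower": -1.0,
    "rho_upper": 1.0,
    "beta_mu": 0.0,
    "beta_sigma": 1e6,
    "sigma_sigma": 10.0,
    "nu_lam": 1 / 30,
}

# Hypothetical override: a tighter coefficient prior and non-negative rho.
priors = {**default_priors, "beta_sigma": 10.0, "rho_lower": 0.0}
```

Such a dict would be passed through the `priors` argument shown in the class signature.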
Robust regression

When robust=True, the error distribution is changed from Normal to Student-t, yielding a model that is robust to heavy-tailed outliers:

\[\varepsilon \sim t_\nu(0, \sigma^2 I)\]

where \(\nu \sim \mathrm{TruncExp}(\lambda_\nu, \mathrm{lower}=2)\) with rate nu_lam (default 1/30). The default nu_lam = 1/30 gives a prior mean of approximately 30, favouring near-Normal tails. The lower bound of 2 ensures the variance exists.
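The prior on \(\nu\) can be checked numerically. The sketch below assumes that TruncExp denotes an exponential distribution truncated below at 2; by the memoryless property this is the same as 2 plus an ordinary exponential draw, so the exact prior mean is \(2 + 1/\lambda_\nu = 32\), in line with the "approximately 30" quoted above. It uses NumPy only and does not reproduce how the library samples \(\nu\).

```python
import numpy as np

rng = np.random.default_rng(42)
nu_lam = 1 / 30   # default rate from the priors dict
lower = 2.0       # lower truncation bound

# An Exponential(rate) truncated below at `lower` is distributed as
# lower + Exponential(rate), by memorylessness.
nu_draws = lower + rng.exponential(scale=1 / nu_lam, size=200_000)

prior_mean = nu_draws.mean()                 # analytically lower + 1/nu_lam
share_above_10 = (nu_draws > 10).mean()      # analytically exp(-(10 - lower) * nu_lam)
```

Most of the prior mass sits at values of \(\nu\) where the Student-t is close to Normal, which is the sense in which the default favours near-Normal tails.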
__init__(formula=None, data=None, y=None, X=None, W=None, priors=None, logdet_method=None, robust=False, w_vars=None)
Methods

- __init__([formula, data, y, X, W, priors, ...])
- fit([draws, tune, chains, target_accept, ...]): Draw samples from the posterior.
- Return fitted values at posterior mean parameters.
- Return residuals on the observed scale.
- spatial_diagnostics(): Run Bayesian LM specification tests and return a summary table.
- spatial_diagnostics_decision([alpha]): Return a model-selection decision from Bayesian LM test results.
- spatial_effects([return_posterior_samples]): Compute Bayesian inference for direct, indirect, and total impacts.
- summary([var_names]): Return posterior summary table.

Attributes

- inference_data: Return the ArviZ InferenceData from the most recent fit.
- pymc_model: Return the PyMC model object built for the most recent fit.
fit(draws=2000, tune=1000, chains=4, target_accept=0.9, random_seed=None, idata_kwargs=None, **sample_kwargs)

Draw samples from the posterior. Accepts idata_kwargs for ArviZ compatibility.

Parameters:
Other parameters as in SpatialModel.

Notes
The log-likelihood for the SDM model is:

\[\log p(y \mid \theta) = \sum_{i=1}^{n} \log \mathcal{N}(y_i \mid \mu_i, \sigma^2) + \log |I - \rho W|\]

where \(\mu = \rho W y + Z \beta\) and \(Z = [X, WX]\).

As with the SAR model, pm.Normal with observed auto-captures the Gaussian part, while the Jacobian \(\log |I - \rho W|\) is added via pm.Potential and is therefore absent from the log_likelihood group. To enable WAIC/LOO and Bayes factor comparison, we correct the pointwise log-likelihood after sampling by spreading the Jacobian evenly over the \(n\) observations:

\[\ell_i = \log \mathcal{N}(y_i \mid \mu_i, \sigma^2) + \frac{1}{n} \log |I - \rho W|\]
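The correction can be sketched for a single parameter draw as follows. This is a NumPy illustration under stated assumptions, not the library's implementation: `pointwise_loglik` is a hypothetical helper, the weights matrix is a toy ring, and the Gaussian term includes its normalising constant so that the pointwise values sum exactly to the joint log-likelihood.

```python
import numpy as np

def pointwise_loglik(y, X, W, rho, beta, sigma):
    """Pointwise SDM log-likelihood with log|I - rho W| split evenly over n terms."""
    n = y.shape[0]
    Z = np.hstack([X, W @ X])                      # Z = [X, WX]
    mu = rho * (W @ y) + Z @ beta
    gauss = -0.5 * np.log(2 * np.pi * sigma**2) - 0.5 * ((y - mu) / sigma) ** 2
    _, logabsdet = np.linalg.slogdet(np.eye(n) - rho * W)
    return gauss + logabsdet / n

# Toy inputs (illustrative values only).
rng = np.random.default_rng(1)
n = 30
idx = np.arange(n)
W = np.zeros((n, n))
W[idx, (idx - 1) % n] = 0.5
W[idx, (idx + 1) % n] = 0.5
X = rng.normal(size=(n, 2))
y = rng.normal(size=n)
beta = np.array([1.0, -0.5, 0.3, 0.2])             # stacked [beta_1, beta_2]

ell = pointwise_loglik(y, X, W, rho=0.4, beta=beta, sigma=1.2)
```

Splitting the Jacobian as \(\frac{1}{n}\log|I - \rho W|\) is a bookkeeping choice: it leaves the total log-likelihood unchanged while giving WAIC/LOO a full-length pointwise vector.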
property inference_data : arviz.data.inference_data.InferenceData | None
Return the ArviZ InferenceData from the most recent fit.

property pymc_model : pymc.model.core.Model | None
Return the PyMC model object built for the most recent fit.
spatial_diagnostics()

Run Bayesian LM specification tests and return a summary table.

Iterates over the class-level _spatial_diagnostics_tests registry and calls each test function on this fitted model, collecting the results into a tidy DataFrame. The set of tests depends on the model type: for example, an OLS model runs LM-Lag, LM-Error, LM-SDM-Joint, and LM-SLX-Error-Joint, while an SAR model runs LM-Error, LM-WX, and Robust-LM-WX.

Requires the model to have been fit (.fit() called) and a spatial weights matrix W to have been supplied at construction time.

Returns:
DataFrame indexed by test name with columns:

Column     Description
statistic  Posterior mean of the LM statistic
median     Posterior median of the LM statistic
df         Degrees of freedom for the \(\chi^2\) reference
p_value    Bayesian p-value: 1 - chi2.cdf(mean, df)
ci_lower   Lower bound of 95% credible interval (2.5%)
ci_upper   Upper bound of 95% credible interval (97.5%)

The DataFrame has attrs["model_type"] (class name) and attrs["n_draws"] (total posterior draws) metadata.

Return type:
pandas.DataFrame
Raises:
RuntimeError – If the model has not been fit yet.
ValueError – If no spatial weights matrix W was supplied.
See also
spatial_diagnostics_decision: Model-selection decision based on the test results.
spatial_effects: Posterior inference for direct/indirect/total impacts.
Examples
>>> ols = OLS(formula="price ~ income + crime", data=df, W=w)
>>> ols.fit()
>>> ols.spatial_diagnostics()
                    statistic  median  df  p_value  ci_lower  ci_upper
LM-Lag                   3.21    2.98   1    0.073      0.12      8.54
LM-Error                 5.67    5.34   1    0.017      0.34     12.10
LM-SDM-Joint             7.89    7.12   4    0.096      1.23     18.32
LM-SLX-Error-Joint       6.45    5.98   4    0.168      0.89     15.67
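The p_value column in the output above follows from the documented rule 1 - chi2.cdf(mean, df). The sketch below reproduces two of the tabulated values with SciPy; `lm_p_value` is a hypothetical helper name, and `chi2.sf` is used as the numerically stable equivalent of `1 - chi2.cdf`.

```python
from scipy.stats import chi2

def lm_p_value(statistic_mean, df):
    """p-value of the posterior-mean LM statistic against a chi-square(df) reference."""
    # chi2.sf(x, df) equals 1 - chi2.cdf(x, df) but is more accurate in the tail.
    return float(chi2.sf(statistic_mean, df))

p_lag = lm_p_value(3.21, 1)      # LM-Lag row: statistic 3.21, df 1
p_joint = lm_p_value(7.89, 4)    # LM-SDM-Joint row: statistic 7.89, df 4
```

Rounded to three decimals these agree with the 0.073 and 0.096 shown in the example table.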
spatial_diagnostics_decision(alpha=0.05)

Return a model-selection decision from Bayesian LM test results.
Implements the decision tree from Koley and Bera [2024] (the Bayesian analogue of the classical stge_kb procedure in Anselin et al. [1996]). The decision logic depends on the current model type and the pattern of significant tests:

From OLS (4-test decision tree):
If LM-SDM-Joint is significant → test Robust-LM-Lag-SDM and Robust-LM-Error-SDEM (requires re-fitting SLX first). If neither robust test is significant → OLS.
If LM-Lag is significant and LM-Error is not → SAR.
If LM-Error is significant and LM-Lag is not → SEM.
If both are significant → test Robust-Lag and Robust-Error. If Robust-Lag is significant → SAR; if Robust-Error → SEM; if neither → SARAR (both lag and error).
From SAR (3-test decision tree):
LM-Error significant → SARAR; LM-WX significant → SDM; Robust-LM-WX significant → SDM.
From SEM (2-test decision tree):
LM-Lag significant → SARAR; LM-WX significant → SDEM.
From SLX (4-test decision tree):
Robust-LM-Lag-SDM significant → SDM; Robust-LM-Error-SDEM significant → SDEM; both → MANSAR; neither → SLX.
From SDM: LM-Error significant → MANSAR; else SDM.
From SDEM: LM-Lag significant → MANSAR; else SDEM.
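The SLX, SDM, and SDEM branches above are fully specified and can be sketched as a plain function. This is an illustration, not the library's code: `decide_from_p_values` is a hypothetical helper, it assumes a test counts as significant when its p-value is below alpha, and it leaves out the OLS, SAR, and SEM branches because their precedence rules are not fully spelled out here.

```python
def decide_from_p_values(model_type, p_values, alpha=0.05):
    """Hypothetical helper: SLX, SDM and SDEM branches of the decision tree only."""
    def significant(test_name):
        # Assumption: "significant" means p-value below alpha.
        return p_values[test_name] < alpha

    if model_type == "SLX":
        lag = significant("Robust-LM-Lag-SDM")
        err = significant("Robust-LM-Error-SDEM")
        if lag and err:
            return "MANSAR"
        if lag:
            return "SDM"
        if err:
            return "SDEM"
        return "SLX"
    if model_type == "SDM":
        return "MANSAR" if significant("LM-Error") else "SDM"
    if model_type == "SDEM":
        return "MANSAR" if significant("LM-Lag") else "SDEM"
    raise ValueError(f"unsupported model type: {model_type}")

# Example: a fitted SDM whose LM-Error test is not significant stays an SDM.
choice = decide_from_p_values("SDM", {"LM-Error": 0.31})
```

Checking the joint case first in the SLX branch matters: if both robust tests are significant, the result is MANSAR rather than whichever single-test rule happens to be evaluated first.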
See also
spatial_diagnostics: Compute the Bayesian LM test statistics.
References
spatial_effects(return_posterior_samples=False)

Compute Bayesian inference for direct, indirect, and total impacts.
Computes impact measures for each posterior draw, then summarises the posterior distribution with means, 95% credible intervals, and Bayesian p-values. This is the fully Bayesian analog of the simulation-based approach in LeSage and Pace [2009] and the asymptotic variance formulas in Arbia et al. [2020].
Models without a spatial lag on y do not exhibit global feedback propagation through \((I-\rho W)^{-1}\). However, models with spatially lagged covariates (SLX, SDEM) can still have non-zero neighbour spillovers captured in the indirect term.
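For the SDM, the per-draw computation can be sketched with the scalar summaries of LeSage and Pace: for covariate \(j\), form \(S_j = (I - \rho W)^{-1}(I\beta_{1j} + W\beta_{2j})\), take the average diagonal element as the direct impact, the average row sum as the total impact, and their difference as the indirect impact. The code below is a NumPy illustration under those assumptions; `sdm_impacts` and the fake posterior draws are hypothetical and do not reflect how the library stores or processes its samples.

```python
import numpy as np

def sdm_impacts(W, rho_draws, beta1_draws, beta2_draws):
    """Direct, indirect and total impacts per posterior draw; each array is (G, k)."""
    n = W.shape[0]
    G, k = beta1_draws.shape
    eye = np.eye(n)
    direct = np.empty((G, k))
    total = np.empty((G, k))
    for g in range(G):
        A_inv = np.linalg.inv(eye - rho_draws[g] * W)    # (I - rho W)^{-1}
        A_inv_W = A_inv @ W
        for j in range(k):
            S = beta1_draws[g, j] * A_inv + beta2_draws[g, j] * A_inv_W
            direct[g, j] = np.trace(S) / n               # average own-unit effect
            total[g, j] = S.sum() / n                    # average row sum
    return {"direct": direct, "indirect": total - direct, "total": total}

# Fake posterior draws on a toy ring of 20 units (illustrative values only).
rng = np.random.default_rng(2)
n, G = 20, 100
idx = np.arange(n)
W = np.zeros((n, n))
W[idx, (idx - 1) % n] = 0.5
W[idx, (idx + 1) % n] = 0.5
rho_draws = rng.uniform(0.2, 0.5, size=G)
beta1_draws = rng.normal([1.0, -0.5], 0.1, size=(G, 2))
beta2_draws = rng.normal([0.3, 0.2], 0.1, size=(G, 2))

impacts = sdm_impacts(W, rho_draws, beta1_draws, beta2_draws)
posterior_means = {name: draws.mean(axis=0) for name, draws in impacts.items()}
intervals = {name: np.percentile(draws, [2.5, 97.5], axis=0) for name, draws in impacts.items()}
```

With a row-standardised W the total impact reduces to \((\beta_{1j} + \beta_{2j})/(1 - \rho)\), which gives a quick sanity check on any implementation. The explicit inverse costs \(O(n^3)\) per draw, so this form is only practical for small n.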
Parameters:

Returns:
If return_posterior_samples is False (default), returns a DataFrame indexed by feature names with columns for posterior means, credible-interval bounds, and Bayesian p-values.

If return_posterior_samples is True, returns (DataFrame, dict) where the dict has keys "direct", "indirect", "total", each mapping to a (G, k) array of posterior draws.

Return type: