bayespecon.models.SDM

class bayespecon.models.SDM(formula=None, data=None, y=None, X=None, W=None, priors=None, logdet_method=None, robust=False, w_vars=None, backend=None)

Bayesian Spatial Durbin Model.

Combines a spatial lag of \(y\) with spatial lags of the regressors \(X\):

\[y = \rho Wy + X\beta + WX\theta + \varepsilon, \quad \varepsilon \sim N(0, \sigma^2 I).\]

The sampled coefficient vector stacks the local and lagged-regressor blocks as \([\beta, \theta]\). The likelihood includes the spatial Jacobian \(\log|I - \rho W|\).
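The model equation can be made concrete by simulating from its reduced form, \(y = (I - \rho W)^{-1}(X\beta + WX\theta + \varepsilon)\). The following is a minimal NumPy sketch and not part of bayespecon; the ring-shaped neighbour structure and all parameter values are made up for illustration.

```python
import numpy as np

rng = np.random.default_rng(0)
n = 6

# Hypothetical ring layout: each unit neighbours the previous and the next one.
A = np.zeros((n, n))
for i in range(n):
    A[i, (i - 1) % n] = 1.0
    A[i, (i + 1) % n] = 1.0
W = A / A.sum(axis=1, keepdims=True)  # row-standardise

rho, sigma = 0.4, 0.5          # illustrative values only
beta = np.array([1.0, 2.0])    # intercept and slope on x1
theta = np.array([-0.5])       # slope on W @ x1 (the constant is not lagged)

x1 = rng.normal(size=n)
X = np.column_stack([np.ones(n), x1])
WX = (W @ x1)[:, None]
eps = rng.normal(scale=sigma, size=n)

# Reduced form: y = (I - rho W)^{-1} (X beta + WX theta + eps)
y = np.linalg.solve(np.eye(n) - rho * W, X @ beta + WX @ theta + eps)

# Stacked design and coefficient vector, ordered as [beta, theta]
Z = np.hstack([X, WX])
coef = np.concatenate([beta, theta])
```

Substituting the simulated y back into the structural equation recovers the drawn errors exactly, which is a quick check that the reduced form and the structural form agree.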

Parameters:

formula : str, optional

Wilkinson-style formula, e.g. "y ~ x1 + x2". Requires data. Intercept is included by default; suppress with "y ~ x - 1".

data : pandas.DataFrame or geopandas.GeoDataFrame, optional

Data source for formula mode.

y : array-like, optional

Dependent variable of shape (n,). Required in matrix mode.

X : array-like or pandas.DataFrame, optional

Design matrix. Required in matrix mode. DataFrame columns are preserved as feature names.

W : libpysal.graph.Graph or scipy.sparse matrix

Spatial weights of shape (n, n). Accepts a libpysal.graph.Graph or any scipy.sparse matrix. The legacy libpysal.weights.W object is not accepted; pass w.sparse or libpysal.graph.Graph.from_W(w) instead. W should be row-standardised; a UserWarning is issued otherwise.
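Row-standardisation means every row of W sums to one. As a rough illustration (a hand-rolled helper, not a bayespecon or libpysal function), a dense binary adjacency matrix can be standardised like this:

```python
import numpy as np

def row_standardise(A):
    """Divide each row by its sum; rows with no neighbours stay all-zero."""
    A = np.asarray(A, dtype=float)
    row_sums = A.sum(axis=1, keepdims=True)
    return np.divide(A, row_sums, out=np.zeros_like(A), where=row_sums != 0)

# Hypothetical 3-unit adjacency: unit 0 borders units 1 and 2.
A = np.array([[0, 1, 1],
              [1, 0, 0],
              [1, 0, 0]])
W_dense = row_standardise(A)
```

Since W must be a scipy.sparse matrix or a Graph, a dense result such as W_dense would still need wrapping, for example with scipy.sparse.csr_matrix(W_dense), before being passed to the constructor.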

priors : dict, optional

Override default priors. Supported keys:

rho_lower (float, default -1.0): Lower bound of the Uniform prior on \(\rho\).
rho_upper (float, default 1.0): Upper bound of the Uniform prior on \(\rho\).
beta_mu (float, default 0.0): Normal prior mean for \([\beta, \theta]\).
beta_sigma (float, default 1e6): Normal prior std for \([\beta, \theta]\).
sigma_sigma (float, default 10.0): HalfNormal prior std for \(\sigma\).
nu_lam (float, default 1/30): Rate of TruncExp(lower=2) prior on \(\nu\) (only used when robust=True).

logdet_method : str, optional

How to compute \(\log|I - \rho W|\). None (default) auto-selects "eigenvalue" when n <= 2000 and "chebyshev" otherwise. Other options: "exact", "grid_dense", "grid_sparse", "sparse_spline", "grid_mc", "grid_ilu".
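For intuition, an eigenvalue-based log-determinant typically rests on the identity \(\log|I - \rho W| = \sum_i \log(1 - \rho\lambda_i)\), where \(\lambda_i\) are the eigenvalues of W. Whether the "eigenvalue" option is implemented exactly this way is an assumption; the sketch below only demonstrates the identity with NumPy on a made-up 4-unit path graph.

```python
import numpy as np

def logdet_eigen(rho, eigvals):
    """log|I - rho W| from precomputed eigenvalues of W (valid while the determinant is positive)."""
    return float(np.sum(np.log(np.abs(1.0 - rho * eigvals))))

# Hypothetical path graph 0-1-2-3, row-standardised.
A = np.array([[0, 1, 0, 0],
              [1, 0, 1, 0],
              [0, 1, 0, 1],
              [0, 0, 1, 0]], dtype=float)
W = A / A.sum(axis=1, keepdims=True)

eigvals = np.linalg.eigvals(W)   # computed once, reused for every rho
rho = 0.6
fast = logdet_eigen(rho, eigvals)

# Direct dense evaluation for comparison
sign, direct = np.linalg.slogdet(np.eye(4) - rho * W)
```

The appeal of this route is that the eigenvalues are computed once, after which each new value of \(\rho\) costs only O(n) work; the one-off eigendecomposition is what becomes expensive for large n.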

robust : bool, default False

If True, replace the Normal error with Student-t. See Robust regression below.

w_vars : list of str, optional

Names of X columns to spatially lag. By default all non-constant columns are lagged. Pass a subset to restrict which variables receive a spatial lag, e.g. w_vars=["income", "density"]. SDM requires at least one WX column; if filtering eliminates all of them a ValueError is raised.
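A matrix-mode call might look like the sketch below. It uses only the constructor arguments and methods listed on this page; the helper name fit_sdm_sketch, the toy data, and the chosen prior value are hypothetical. Whether matrix mode expects an explicit constant column in X is not stated here, so none is added. The bayespecon and SciPy imports are deferred into the helper so that the data-preparation part runs with NumPy alone.

```python
import numpy as np

def fit_sdm_sketch(y, X, W_dense, **fit_kwargs):
    """Hypothetical helper: build and fit an SDM in matrix mode."""
    from scipy import sparse            # deferred: only needed when fitting
    from bayespecon.models import SDM   # deferred: requires bayespecon

    W = sparse.csr_matrix(W_dense)      # W must be sparse (or a Graph)
    model = SDM(y=y, X=X, W=W, priors={"beta_sigma": 10.0})
    model.fit(**fit_kwargs)
    return model

# Toy inputs: a ring of n units and two random regressors.
rng = np.random.default_rng(42)
n = 50
idx = np.arange(n)
A = np.zeros((n, n))
A[idx, (idx - 1) % n] = 1.0
A[idx, (idx + 1) % n] = 1.0
W_dense = A / A.sum(axis=1, keepdims=True)   # row-standardised
X = rng.normal(size=(n, 2))
y = rng.normal(size=n)                       # placeholder response

# With bayespecon installed, one would then call, for example:
# model = fit_sdm_sketch(y, X, W_dense, draws=500, tune=500, chains=2, random_seed=1)
# print(model.summary())
# print(model.spatial_effects())
```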

Notes

Direct, indirect and total effects of \(X\) on \(y\) incorporate both the local and lagged-X blocks via the spatial multiplier \((I - \rho W)^{-1}\) and are reported by spatial_effects().
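As a rough guide to what these quantities mean, the usual scalar summaries for regressor k are built from the matrix \(S_k = (I - \rho W)^{-1}(\beta_k I + \theta_k W)\): the direct effect is the average diagonal element, the total effect is the average row sum, and the indirect effect is their difference. The sketch below follows that textbook definition for a single parameter draw; it is not the code behind spatial_effects(), and the function name and values are illustrative.

```python
import numpy as np

def sdm_effects(W, rho, beta_k, theta_k):
    """Direct, indirect and total effects of one regressor for one parameter draw."""
    n = W.shape[0]
    # S[i, j] = d y_i / d x_{jk}
    S = np.linalg.solve(np.eye(n) - rho * W, beta_k * np.eye(n) + theta_k * W)
    direct = np.trace(S) / n      # average own-unit response
    total = S.sum() / n           # average row sum
    return direct, total - direct, total

# Hypothetical ring of 5 units, row-standardised.
n = 5
idx = np.arange(n)
A = np.zeros((n, n))
A[idx, (idx - 1) % n] = 1.0
A[idx, (idx + 1) % n] = 1.0
W = A / A.sum(axis=1, keepdims=True)

direct, indirect, total = sdm_effects(W, rho=0.4, beta_k=1.0, theta_k=-0.5)
```

For a row-standardised W the total effect reduces to \((\beta_k + \theta_k)/(1 - \rho)\), which is a convenient sanity check. Applying the function to every posterior draw would give a posterior distribution for each effect.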

Robust regression

When robust=True, the error distribution is changed from Normal to Student-t:

\[\varepsilon \sim t_\nu(0, \sigma^2 I)\]

where \(\nu \sim \mathrm{TruncExp}(\lambda_\nu, \mathrm{lower}=2)\) with rate nu_lam (default 1/30, mean ≈ 30).

__init__(formula=None, data=None, y=None, X=None, W=None, priors=None, logdet_method=None, robust=False, w_vars=None, backend=None)

Methods

`__init__`([formula, data, y, X, W, priors, ...])
`fit`([draws, tune, chains, target_accept, ...])	Draw samples from the posterior.
`fitted_values`()	Return fitted values at posterior mean parameters.
`residuals`()	Return residuals on the observed (or transformed-panel) scale.
`spatial_diagnostics`()	Run Bayesian LM specification tests and return a summary table.
`spatial_diagnostics_decision`([alpha, ...])	Return a model-selection decision from Bayesian LM test results.
`spatial_effects`([return_posterior_samples])	Compute Bayesian inference for direct, indirect, and total impacts.
`summary`([var_names])	Return posterior summary table.

Attributes

`inference_data`	Return the ArviZ InferenceData from the most recent fit.
`pymc_model`	Return the PyMC model object built for the most recent fit.

fit(draws=2000, tune=1000, chains=4, target_accept=0.9, random_seed=None, idata_kwargs=None, **sample_kwargs)

Draw samples from the posterior. Accepts idata_kwargs for ArviZ compatibility.

Parameters:

idata_kwargs : dict, optional

Passed to pm.sample for InferenceData creation. If it contains log_likelihood: True, the complete pointwise log-likelihood (including the Jacobian correction) is attached to the output.

All other parameters are as in SpatialModel.

Notes

The log-likelihood for the SDM is:

\[\log p(y \mid \rho, \beta, \theta, \sigma) = \sum_{i=1}^{n} \log \mathcal{N}(y_i \mid \mu_i, \sigma^2) + \log |I - \rho W |\]

where \(\mu = \rho W y + Z \gamma\), \(Z = [X, WX]\), and \(\gamma = [\beta, \theta]\) is the stacked coefficient vector.

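As with the SAR model, pm.Normal with observed data automatically records the Gaussian part in the log_likelihood group, whereas the Jacobian \(\log |I - \rho W|\) is added via pm.Potential and is therefore absent from that group. To enable WAIC/LOO and Bayes factor comparison, the pointwise log-likelihood is corrected after sampling by assigning each observation an equal share of the Jacobian:

\[\ell_i = \log \mathcal{N}(y_i \mid \mu_i, \sigma^2) + \frac{1}{n} \log |I - \rho W | = -\frac{1}{2}\log(2\pi\sigma^2) - \frac{1}{2}\left(\frac{y_i - \mu_i}{\sigma}\right)^2 + \frac{1}{n} \log |I - \rho W |\]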
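The correction can be sketched for a single posterior draw as follows. This is an illustrative NumPy version using the full Gaussian log-density \(\log \mathcal{N}(y_i \mid \mu_i, \sigma^2)\) from the joint log-likelihood above; the function name and the toy numbers are assumptions, not bayespecon internals.

```python
import numpy as np

def pointwise_loglik(y, mu, sigma, rho, W):
    """Per-observation log-likelihood with an equal 1/n share of the Jacobian."""
    n = y.shape[0]
    gauss = -0.5 * np.log(2.0 * np.pi * sigma**2) - 0.5 * ((y - mu) / sigma) ** 2
    _, logdet = np.linalg.slogdet(np.eye(n) - rho * W)
    return gauss + logdet / n

# Toy draw: 3 units, made-up parameter values.
W = np.array([[0.0, 0.5, 0.5],
              [1.0, 0.0, 0.0],
              [1.0, 0.0, 0.0]])
y = np.array([1.0, 0.2, -0.3])
mu = np.array([0.8, 0.1, 0.0])
sigma, rho = 0.7, 0.3

ell = pointwise_loglik(y, mu, sigma, rho, W)
```

Summing ell over observations reproduces the joint log-likelihood, so criteria built from the pointwise values remain consistent with the model that was actually sampled.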

fitted_values(): Return fitted values at posterior mean parameters.

property inference_data : arviz.data.inference_data.InferenceData | None: Return the ArviZ InferenceData from the most recent fit.

property pymc_model : pymc.model.core.Model | None: Return the PyMC model object built for the most recent fit.

residuals(): Return residuals on the observed (or transformed-panel) scale.

spatial_diagnostics()

Run Bayesian LM specification tests and return a summary table.

Iterates over the class-level _spatial_diagnostics_tests registry and calls each test function on this fitted model, collecting the results into a tidy DataFrame. The set of tests depends on the model type.

The model must already have been fitted by calling .fit(). For cross-sectional models, a spatial weights matrix W must also have been supplied at construction time.

Returns:

DataFrame indexed by test name with columns statistic (posterior mean), median, df (degrees of freedom for the \(\chi^2\) reference), p_value (Bayesian p-value 1 - chi2.cdf(mean, df)), and ci_lower / ci_upper (95% credible interval). The DataFrame carries attrs["model_type"] and attrs["n_draws"] metadata.

Return type:

pandas.DataFrame

Raises:

RuntimeError – If the model has not been fit yet.
ValueError – If a cross-sectional model was constructed without W.
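The p_value column described above is the upper-tail \(\chi^2\) probability evaluated at the posterior mean of the statistic. The sketch below shows that calculation with the standard library only, using the closed forms that exist for df = 1 and df = 2; for general df, scipy.stats.chi2.sf gives the same quantity. The per-draw statistics are made up.

```python
import math

def chi2_sf(x, df):
    """Upper-tail chi-square probability, 1 - cdf(x), for df in {1, 2} only."""
    if df == 1:
        return math.erfc(math.sqrt(x / 2.0))
    if df == 2:
        return math.exp(-x / 2.0)
    raise NotImplementedError("closed form shown only for df = 1 or 2")

# Hypothetical per-draw values of an LM statistic.
draws = [2.1, 4.7, 3.9, 5.3, 3.2]
stat_mean = sum(draws) / len(draws)      # the 'statistic' column
p_value = chi2_sf(stat_mean, df=1)       # 1 - chi2.cdf(mean, df)
```

Here the posterior mean sits close to the conventional 5% critical value for one degree of freedom, so the resulting p-value is close to 0.05.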