bayespecon.SARNegativeBinomial

class bayespecon.SARNegativeBinomial(*args, **kwargs)[source]

Bayesian SAR model with a Negative Binomial likelihood.

Parameters:
formula

Same interface and semantics as bayespecon.models.sar.SAR.

data

Same interface and semantics as bayespecon.models.sar.SAR.

y

Same interface and semantics as bayespecon.models.sar.SAR.

X

Same interface and semantics as bayespecon.models.sar.SAR.

W

Same interface and semantics as bayespecon.models.sar.SAR.

priors

Same interface and semantics as bayespecon.models.sar.SAR.

logdet_method

Same interface and semantics as bayespecon.models.sar.SAR.

robust : bool, default False

Not supported for count outcomes. If True, NotImplementedError is raised.

Notes

The model uses the SAR structural form with a non-centred parameterisation:

\[\eta = (I - \rho W)^{-1}(X \beta + \sigma z), \quad z \sim N(0, I),\]

which is equivalent to the centred form \(\eta \sim N((I - \rho W)^{-1} X \beta,\; \sigma^2 (I - \rho W)^{-1}(I - \rho W')^{-1})\). This matches the structural form used by SARNegBinLatent, enabling fair comparison between NUTS and Gibbs samplers.
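The equivalence of the two forms can be checked numerically. The following is a minimal NumPy sketch, not the library's implementation: the ring-shaped weights matrix, the parameter values, and all variable names are assumptions chosen for illustration.

```python
import numpy as np

rng = np.random.default_rng(0)

# Hypothetical setup: 5 units on a ring, row-standardised weights.
n = 5
W = np.zeros((n, n))
for i in range(n):
    W[i, (i - 1) % n] = 0.5
    W[i, (i + 1) % n] = 0.5

rho, sigma = 0.4, 0.7                      # assumed parameter values
X = np.column_stack([np.ones(n), rng.normal(size=n)])
beta = np.array([0.5, -1.0])
A = np.eye(n) - rho * W                    # spatial filter I - rho W

# Non-centred draws: eta = A^{-1} (X beta + sigma z), z ~ N(0, I).
z = rng.normal(size=(n, 200_000))
eta = np.linalg.solve(A, (X @ beta)[:, None] + sigma * z)

# Moments implied by the centred form.
A_inv = np.linalg.inv(A)
mean_centred = A_inv @ X @ beta
cov_centred = sigma**2 * A_inv @ A_inv.T

# Empirical moments of the non-centred draws agree up to Monte Carlo error.
max_mean_err = np.abs(eta.mean(axis=1) - mean_centred).max()
max_cov_err = np.abs(np.cov(eta) - cov_centred).max()
print(max_mean_err, max_cov_err)
```

Both printed discrepancies shrink as the number of draws grows, which is what the stated equivalence predicts.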

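No spatial Jacobian term \(\log|I - \rho W|\) is added to the log-density. In the non-centred parameterisation the sampler works on \(z\), and \(\eta\) is a deterministic function of \(z\); the \(|I - \rho W|\) factor that appears in the centred multivariate-normal normalisation constant is exactly the change-of-variables Jacobian between \(z\) and \(\eta\), so the two cancel.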
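The cancellation can be made explicit with a short derivation, written with the shorthand \(A = I - \rho W\) (introduced for this note only) and assuming \(|A| > 0\). The centred density of \(\eta\) is

\[p(\eta \mid \rho, \beta, \sigma) = (2\pi)^{-n/2}\, \sigma^{-n}\, |A|\, \exp\!\left(-\frac{1}{2\sigma^2} \lVert A\eta - X\beta \rVert^2\right),\]

so its logarithm contains \(+\log|A|\). Under the non-centred map \(z = (A\eta - X\beta)/\sigma\), the Jacobian determinant is \(|\partial z / \partial \eta| = \sigma^{-n} |A|\), and

\[p(\eta \mid \rho, \beta, \sigma) = p(z)\, \left|\frac{\partial z}{\partial \eta}\right| = (2\pi)^{-n/2} \exp\!\left(-\tfrac{1}{2} z'z\right) \sigma^{-n}\, |A|.\]

The \(\sigma^{-n}|A|\) factor in the centred normalisation constant is therefore the Jacobian of the transformation. Sampling \(z \sim N(0, I)\) directly and treating \(\eta\) as a deterministic function of \((z, \rho, \beta, \sigma)\) leaves no log-determinant term to add.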

Overdispersion is captured by the NB2 parameter alpha, and spatial noise in the latent field by sigma.

__init__(*args, **kwargs)[source]

Methods

__init__(*args, **kwargs)

fit([draws, tune, chains, target_accept, ...])

Sample posterior.

fitted_values()

Return fitted values at posterior mean parameters.

residuals()

Return residuals on the observed scale.

spatial_diagnostics()

Run Bayesian LM specification tests and return a summary table.

spatial_diagnostics_decision([alpha, format])

Return a model-selection decision from Bayesian LM test results.

spatial_effects([return_posterior_samples, ...])

Compute Bayesian inference for direct, indirect, and total impacts.

summary([var_names])

Return posterior summary table.

Attributes

inference_data

Return the ArviZ InferenceData from the most recent fit.

pymc_model

Return the PyMC model object built for the most recent fit.

fit(draws=2000, tune=1000, chains=4, target_accept=0.9, random_seed=None, idata_kwargs=None, **sample_kwargs)[source]

Sample posterior.

The Negative Binomial log-likelihood is captured automatically by PyMC. No Jacobian correction is added, because in the non-centred parameterisation the change-of-variables Jacobian cancels against the multivariate-normal normalisation constant.

fitted_values()[source]

Return fitted values at posterior mean parameters.

Returns:

Posterior-mean fitted values.

Return type:

np.ndarray

property inference_data : arviz.data.inference_data.InferenceData | None[source]

Return the ArviZ InferenceData from the most recent fit.

Returns:

The inference data object, or None if the model has not been fit yet.

Return type:

arviz.InferenceData or None

property pymc_model : pymc.model.core.Model | None[source]

Return the PyMC model object built for the most recent fit.

For Gibbs-fitted models the PyMC model is not constructed during sampling; it is built lazily on first access so that downstream consumers (e.g. bridge sampling for marginal likelihoods) can evaluate logp and the prior under the same model definition used by the NUTS path.

Returns:

The model object used by fit(), or None if the instance has not been fit yet.

Return type:

pymc.Model or None

residuals()[source]

Return residuals on the observed scale.

Returns:

Residual vector y - fitted_values.

Return type:

np.ndarray

spatial_diagnostics()[source]

Run Bayesian LM specification tests and return a summary table.

Looks up the diagnostic suite registered for this model class and calls each test function on this fitted model, collecting the results into a tidy DataFrame. The set of tests depends on the model type — for example, an OLS model runs LM-Lag, LM-Error, LM-SDM-Joint, and LM-SLX-Error-Joint, while an SAR model runs LM-Error, LM-WX, and Robust-LM-WX.

Requires the model to have been fit (.fit() called) and a spatial weights matrix W to have been supplied at construction time.

Returns:

DataFrame indexed by test name with columns:

Column     Description
statistic  Posterior mean of the LM statistic
median     Posterior median of the LM statistic
df         Degrees of freedom for the \(\chi^2\) reference
p_value    Bayesian p-value: 1 - chi2.cdf(mean, df)
ci_lower   Lower bound of 95% credible interval (2.5%)
ci_upper   Upper bound of 95% credible interval (97.5%)

The DataFrame has attrs["model_type"] (class name) and attrs["n_draws"] (total posterior draws) metadata.
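As an illustration of how one row of this table relates to the posterior draws of an LM statistic, here is a minimal standard-library sketch. The helper names `chi2_sf` and `summarise_lm_draws` are hypothetical and not part of bayespecon, and the quantile convention used for the interval bounds is an assumption.

```python
import math
import statistics


def chi2_sf(x, df):
    """Upper tail 1 - chi2.cdf(x, df) for a positive integer df."""
    if x <= 0:
        return 1.0
    y = x / 2.0
    # Base case with a closed form: df = 1 (odd) or df = 2 (even).
    if df % 2:
        a, q = 0.5, math.erfc(math.sqrt(y))
    else:
        a, q = 1.0, math.exp(-y)
    # Recurrence Q(a + 1, y) = Q(a, y) + y**a * exp(-y) / Gamma(a + 1).
    while a < df / 2.0:
        q += y**a * math.exp(-y) / math.gamma(a + 1.0)
        a += 1.0
    return q


def summarise_lm_draws(draws, df):
    """Summarise posterior draws of one LM statistic as a table row."""
    mean = statistics.fmean(draws)
    # 39 cut points at 2.5%, 5%, ..., 97.5%; keep the outer two.
    cuts = statistics.quantiles(draws, n=40, method="inclusive")
    return {
        "statistic": mean,
        "median": statistics.median(draws),
        "df": df,
        "p_value": chi2_sf(mean, df),  # evaluated at the posterior mean
        "ci_lower": cuts[0],
        "ci_upper": cuts[-1],
    }


# The p-values in this method's Examples output follow from the means shown there.
print(round(chi2_sf(3.21, 1), 3))  # 0.073, the LM-Lag row
print(round(chi2_sf(7.89, 4), 3))  # 0.096, the LM-SDM-Joint row
```

Note that the p-value is computed from the posterior mean of the statistic, not averaged over draws, which matches the `1 - chi2.cdf(mean, df)` definition in the table.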

Return type:

pandas.DataFrame

Raises:
  • RuntimeError – If the model has not been fit yet.

  • ValueError – If no spatial weights matrix W was supplied.

See also

spatial_diagnostics_decision

Model-selection decision based on the test results.

spatial_effects

Posterior inference for direct/indirect/total impacts.

Examples

>>> ols = OLS(formula="price ~ income + crime", data=df, W=w)
>>> ols.fit()
>>> ols.spatial_diagnostics()
                 statistic  median  df  p_value  ci_lower  ci_upper
LM-Lag                3.21    2.98   1    0.073      0.12      8.54
LM-Error              5.67    5.34   1    0.017      0.34     12.10
LM-SDM-Joint          7.89    7.12   4    0.096      1.23     18.32
LM-SLX-Error-Joint    6.45    5.98   4    0.168      0.89     15.67

spatial_diagnostics_decision(alpha=0.05, format='graphviz')[source]

Return a model-selection decision from Bayesian LM test results.

Implements the decision tree from Koley and Bera [2024] (the Bayesian analogue of the classical stge_kb procedure in Anselin et al. [1996]). The decision logic depends on the current model type and the pattern of significant tests:

From OLS (6-test decision tree):

  1. If only LM-Lag is significant → SAR.

  2. If only LM-Error is significant → SEM.

  3. If both are significant → use the Anselin–Florax / Koley–Bera robust pair: only Robust-LM-Lag significant → SAR, only Robust-LM-Error significant → SEM, both significant → SARAR. If neither robust test is significant, fall back to the model favoured by the lower of the two raw (non-robust) p-values.

  4. If neither LM-Lag nor LM-Error is significant → OLS.

From SAR (3-test decision tree):

  • LM-Error significant → SARAR; LM-WX significant → SDM; Robust-LM-WX significant → SDM.

From SEM (2-test decision tree):

  • LM-Lag significant → SARAR; LM-WX significant → SDEM.

From SLX (4-test decision tree):

  • Robust-LM-Lag-SDM significant → SDM; Robust-LM-Error-SDEM significant → SDEM; both → MANSAR; neither → SLX.

From SDM: LM-Error-SDM significant → MANSAR; else SDM.

From SDEM: LM-Lag-SDEM significant → MANSAR; else SDEM.
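The OLS branch above can be written as a small pure-Python function. This is an illustrative sketch of the rules as listed, not the library's implementation. The function name `decide_from_ols` and the dictionary keys are assumptions, "significant" is taken to mean a p-value below `alpha`, and a tie between the two raw p-values is resolved in favour of SAR arbitrarily.

```python
def decide_from_ols(p_values, alpha=0.05):
    """Apply the four OLS rules to a mapping of test name -> Bayesian p-value."""
    lag = p_values["LM-Lag"] < alpha
    err = p_values["LM-Error"] < alpha

    if not lag and not err:
        return "OLS"      # rule 4
    if lag and not err:
        return "SAR"      # rule 1
    if err and not lag:
        return "SEM"      # rule 2

    # Rule 3: both raw tests are significant, so consult the robust pair.
    r_lag = p_values["Robust-LM-Lag"] < alpha
    r_err = p_values["Robust-LM-Error"] < alpha
    if r_lag and r_err:
        return "SARAR"
    if r_lag:
        return "SAR"
    if r_err:
        return "SEM"
    # Neither robust test is significant: fall back to the lower raw p-value.
    return "SAR" if p_values["LM-Lag"] <= p_values["LM-Error"] else "SEM"


print(decide_from_ols({"LM-Lag": 0.01, "LM-Error": 0.30}))  # SAR
print(decide_from_ols({
    "LM-Lag": 0.01, "LM-Error": 0.02,
    "Robust-LM-Lag": 0.40, "Robust-LM-Error": 0.03,
}))  # SEM
```

The robust p-values are only looked up when both raw tests are significant, so the mapping may omit them in the simpler cases.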

Parameters:
alpha : float, default 0.05

Significance level for the Bayesian p-values.

format : {"graphviz", "ascii", "model"}, default "graphviz"

Output format. "model" returns the recommended-model name string. "ascii" returns an indented box-drawing rendering of the full decision tree with the chosen path highlighted. "graphviz" returns a graphviz.Digraph object that renders inline in Jupyter; if the optional graphviz package is not installed a UserWarning is issued and the ASCII rendering is returned instead.

Returns:

Recommended model name when format="model", an ASCII tree string when format="ascii", or a graphviz.Digraph when format="graphviz" (falling back to the ASCII string if the optional graphviz package is not installed).

Return type:

str or graphviz.Digraph
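The optional-dependency fallback described for format="graphviz" follows a common pattern, sketched below. The function `render_path` and its arguments are hypothetical, and the sketch draws only the chosen path instead of the full decision tree; `graphviz.Digraph` and its `edge` method are the only parts taken from the optional graphviz package.

```python
import warnings


def render_path(path, fmt="graphviz"):
    """Render a chosen path such as ["OLS", "SAR"] in the requested format."""
    ascii_tree = "\n".join("  " * depth + "└─ " + name
                           for depth, name in enumerate(path))
    if fmt == "model":
        return path[-1]
    if fmt == "ascii":
        return ascii_tree
    if fmt != "graphviz":
        raise ValueError(f"unknown format: {fmt!r}")
    try:
        import graphviz  # optional dependency
    except ImportError:
        warnings.warn("graphviz is not installed; returning the ASCII tree",
                      UserWarning)
        return ascii_tree
    graph = graphviz.Digraph()
    for parent, child in zip(path, path[1:]):
        graph.edge(parent, child)
    return graph


print(render_path(["OLS", "SAR"], fmt="model"))
print(render_path(["OLS", "SAR"], fmt="ascii"))
```

Callers that need a plain string regardless of the environment can request format="ascii" or format="model" directly and avoid the fallback path.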

See also

spatial_diagnostics

Compute the Bayesian LM test statistics.

References

Koley and Bera [2024], Anselin et al. [1996]

spatial_effects(return_posterior_samples=False, scale='logmean', method='auto')[source]

Compute Bayesian inference for direct, indirect, and total impacts.

Parameters:
return_posterior_samples : bool, optional

If True, also return the posterior draws for each effect type.

scale : {"logmean", "count"}, default "logmean"

Scale on which impacts are reported.

"logmean" (the default) returns impacts on the linear-predictor scale \(\eta = \log \mu\).

"count" returns impacts on the expected-count scale \(\mu = \exp(\eta)\). This is exact but more expensive because it requires the diagonal of the spatial multiplier \((I - \rho W)^{-1}\) for each posterior draw.

method : {"auto", "eigen", "sparse"}, default "auto"

Only used when scale="count". "eigen" materialises the eigendecomposition of \(W\) (fast for small \(n\), but O(n³) time and O(n²) memory); "sparse" uses one sparse LU per draw plus a Hutchinson diagonal estimator; "auto" picks "sparse" when \(n\) exceeds _COUNT_EFFECTS_EIGEN_MAX_N (default 2000) and "eigen" otherwise.
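The quantities involved can be illustrated with a dense NumPy sketch for a single posterior draw. It is not the library's implementation: the function names, the ring-shaped weights matrix, and the parameter values are assumptions. The sketch uses the standard average-impact summaries for SAR models (direct from the diagonal of \((I - \rho W)^{-1}\), total from its row sums), weights each unit by \(\mu_i\) on the count scale via the chain rule, and shows a Hutchinson-style estimate of the diagonal with a dense solve standing in for the sparse LU.

```python
import numpy as np


def average_impacts(rho, beta_k, W, mu=None):
    """Average (direct, indirect, total) impacts of one covariate for one draw.

    With mu=None the impacts are on the linear-predictor scale; passing the
    expected counts mu weights each unit by mu_i (d mu_i / d eta_i = mu_i).
    """
    n = W.shape[0]
    S = np.linalg.inv(np.eye(n) - rho * W)     # spatial multiplier
    weights = np.ones(n) if mu is None else np.asarray(mu, dtype=float)
    direct = np.mean(weights * np.diag(S)) * beta_k
    total = np.mean(weights * S.sum(axis=1)) * beta_k
    return direct, total - direct, total


def hutchinson_diag(rho, W, n_probes, rng):
    """Estimate diag((I - rho W)^{-1}) by averaging v * (S v) over sign vectors v."""
    n = W.shape[0]
    A = np.eye(n) - rho * W
    V = rng.choice([-1.0, 1.0], size=(n, n_probes))   # Rademacher probes
    SV = np.linalg.solve(A, V)                 # one solve for all probes
    return (V * SV).mean(axis=1)


# Hypothetical example: 6 units on a ring, row-standardised weights.
n = 6
W = np.zeros((n, n))
for i in range(n):
    W[i, (i - 1) % n] = W[i, (i + 1) % n] = 0.5

rho, beta_k = 0.4, 1.2                         # assumed values for one draw
direct, indirect, total = average_impacts(rho, beta_k, W)
exact_diag = np.diag(np.linalg.inv(np.eye(n) - rho * W))
approx_diag = hutchinson_diag(rho, W, n_probes=4000,
                              rng=np.random.default_rng(1))
print(direct, indirect, total)
print(np.abs(approx_diag - exact_diag).max())
```

With a row-standardised \(W\), the rows of the multiplier sum to \(1/(1 - \rho)\), so the total impact on the linear-predictor scale reduces to \(\beta_k / (1 - \rho)\), which gives a quick sanity check on any implementation.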

summary(var_names=None, **kwargs)[source]

Return posterior summary table.

Parameters:
var_names : list, optional

Variable names to include in the summary.

**kwargs

Additional arguments passed to arviz.summary().

Returns:

Posterior summary statistics.

Return type:

pandas.DataFrame