bayespecon.OLSPanelFE
class bayespecon.OLSPanelFE(formula=None, data=None, y=None, X=None, W=None, unit_col=None, time_col=None, N=None, T=None, model=0, priors=None, logdet_method=None, robust=False, w_vars=None, backend=None, trace_estimator='hutchpp', trace_k=None)
Bayesian pooled and fixed-effects linear panel regression.
Implements the Gaussian panel model
\[y_{it} = x_{it}'\beta + \alpha_i + \tau_t + \varepsilon_{it}, \qquad \varepsilon_{it} \sim \mathcal{N}(0, \sigma^2),\]
where the included effects depend on model: 0 (pooled), 1 (unit effects), 2 (time effects), 3 (two-way effects). The within transformation is handled by SpatialPanelModel before the likelihood is evaluated.
Parameters:
- formula : str, optional
Wilkinson-style formula, e.g. "y ~ x1 + x2". Requires data, unit_col, and time_col.
- data : pandas.DataFrame, optional
Long-format panel data when using formula mode.
- y : array-like, optional
Stacked response of shape (N*T,) in unit-major order. Required in matrix mode.
- X : array-like or pandas.DataFrame, optional
Stacked design matrix of shape (N*T, k). Required in matrix mode. DataFrame columns are preserved as feature names.
- W : libpysal.graph.Graph or scipy.sparse matrix
Spatial weights of shape (N, N) (preferred) or (N*T, N*T) block-diagonal. Accepted for API consistency with the other panel models but does not enter the OLS likelihood; required if downstream Bayesian LM diagnostics will be run.
- unit_col : str, optional
Column in data identifying the cross-sectional unit. Required in formula mode.
- time_col : str, optional
Column in data identifying the time period. Required in formula mode.
- N : int, optional
Number of cross-sectional units. Required in matrix mode if not inferable.
- T : int, optional
Number of time periods. Required in matrix mode if not inferable.
- model : int, default 0
Fixed-effects specification: 0 (pooled), 1 (unit FE), 2 (time FE), 3 (two-way FE).
- priors : dict, optional
Override default priors. Supported keys:
  - beta_mu (array, default Gelman 2008): Normal prior mean for \(\beta\).
  - beta_sigma (array, default Gelman 2008): Normal prior std for \(\beta\).
  - sigma2_alpha (float, default 2.0): InverseGamma shape for \(\sigma^2\).
  - sigma2_beta (float, default Var(y)): InverseGamma scale for \(\sigma^2\).
  - nu_lam (float, default 1/30): Rate of TruncExp(lower=2) prior on \(\nu\) (only used when robust=True).
- logdet_method : str, optional
Accepted for API consistency; unused in OLSPanelFE (no spatial Jacobian).
- robust : bool, default False
If True, replace the Normal error with Student-t. See Robust regression below.
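The model equation and the model codes above can be illustrated with a small NumPy sketch of the standard within (demeaning) transformation on a balanced panel. This is not the library's internal code: the helper within_transform and all variable names are hypothetical, and the sketch uses explicit unit and time index arrays so that it does not rely on any particular stacking order.

```python
import numpy as np

def within_transform(v, unit, time, model):
    """Demean a 1-D array for model codes 0, 1, 2, 3 (balanced panel assumed)."""
    v = np.asarray(v, dtype=float)
    out = v.copy()
    if model in (1, 3):
        # Subtract each unit's mean over time (removes alpha_i).
        unit_mean = np.bincount(unit, weights=v) / np.bincount(unit)
        out -= unit_mean[unit]
    if model in (2, 3):
        # Subtract each period's mean over units (removes tau_t).
        time_mean = np.bincount(time, weights=v) / np.bincount(time)
        out -= time_mean[time]
    if model == 3:
        # Two-way demeaning subtracts the grand mean twice, so add it back once.
        out += v.mean()
    return out

# Simulate y_it = x_it * beta + alpha_i + tau_t + eps_it.
rng = np.random.default_rng(0)
N, T = 30, 10
unit = np.repeat(np.arange(N), T)
time = np.tile(np.arange(T), N)
alpha = rng.normal(size=N)
tau = rng.normal(size=T)
x = rng.normal(size=N * T)
beta = 1.5
y = beta * x + alpha[unit] + tau[time] + rng.normal(scale=0.1, size=N * T)

# After two-way demeaning, a plain least-squares slope recovers beta.
y_w = within_transform(y, unit, time, model=3)
x_w = within_transform(x, unit, time, model=3)
beta_hat = (x_w @ y_w) / (x_w @ x_w)
```

With model=0 the helper returns the data unchanged, which corresponds to the pooled specification.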
Notes
This is the aspatial baseline for panel LM diagnostics and panel model comparison. The spatial weights object W is accepted for API consistency but does not enter the likelihood.
Robust regression
When robust=True, the error distribution is changed from Normal to Student-t, yielding a model that is robust to heavy-tailed outliers:
\[\varepsilon_{it} \sim t_\nu(0, \sigma^2)\]
where \(\nu \sim \mathrm{TruncExp}(\lambda_\nu, \mathrm{lower}=2)\) with rate nu_lam. The default nu_lam = 1/30 gives a prior mean of approximately 30, favouring near-Normal tails. The lower bound of 2 ensures the variance exists.
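The prior on the degrees of freedom can be explored by simulation. The sketch below assumes that TruncExp denotes an exponential distribution with rate nu_lam truncated below at 2; by memorylessness this is the same as 2 plus an ordinary exponential draw. Under that assumption the exact prior mean is 2 + 1/nu_lam = 32, in line with the "approximately 30" quoted above.

```python
import numpy as np

rng = np.random.default_rng(42)
nu_lam = 1 / 30   # documented default rate
lower = 2.0       # documented lower bound

# Assumption: truncated-below exponential == lower + Exponential(rate).
nu = lower + rng.exponential(scale=1 / nu_lam, size=200_000)

prior_mean = nu.mean()             # close to 2 + 30 = 32
share_above_30 = (nu > 30).mean()  # close to exp(-28/30)
```

A larger nu_lam pulls the prior mass toward small degrees of freedom, that is, toward heavier tails.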
__init__(formula=None, data=None, y=None, X=None, W=None, unit_col=None, time_col=None, N=None, T=None, model=0, priors=None, logdet_method=None, robust=False, w_vars=None, backend=None, trace_estimator='hutchpp', trace_k=None)
Methods
- __init__([formula, data, y, X, W, unit_col, ...])
- fit([draws, tune, chains, target_accept, ...]): Sample the posterior for the panel model.
- Return fitted values at posterior mean parameters.
- Return transformed residuals y - fitted.
- spatial_diagnostics(): Run Bayesian LM specification tests and return a summary table.
- spatial_diagnostics_decision([alpha, format]): Return a model-selection decision from Bayesian LM test results.
- spatial_effects([return_posterior_samples]): Compute Bayesian inference for direct, indirect, and total impacts.
- summary([var_names]): Return posterior summary table.
Attributes
- inference_data: Return the ArviZ InferenceData from the most recent fit.
- pymc_model: Return the PyMC model object built for the most recent fit.
fit(draws=2000, tune=1000, chains=4, target_accept=0.9, random_seed=None, progressbar=True, **sample_kwargs)
Sample the posterior for the panel model.
Parameters:
- draws : int, default 2000
Number of post-tuning draws per chain.
- tune : int, default 1000
Number of tuning draws per chain.
- chains : int, default 4
Number of chains.
- target_accept : float, default 0.9
NUTS target acceptance probability.
- random_seed : int, optional
Random seed used by PyMC.
- progressbar : bool, default True
Show progress bar during sampling.
- **sample_kwargs
Extra keyword arguments forwarded to pymc.sample(). Pass nuts_sampler="blackjax" (or "numpyro", "nutpie") to select an alternative NUTS backend; defaults to PyMC's built-in sampler.
Returns:
Posterior samples and diagnostics.
Return type:
arviz.InferenceData
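A minimal usage sketch in formula mode, using only the constructor and fit arguments documented on this page. The helpers make_panel and fit_unit_fe, the column names, and the simulated coefficients are hypothetical. The library import sits inside fit_unit_fe, which is defined but not called here, so the data helper runs without bayespecon installed.

```python
import numpy as np
import pandas as pd

def make_panel(n_units=6, n_periods=5, seed=0):
    """Simulate a small long-format panel with unit effects (illustrative only)."""
    rng = np.random.default_rng(seed)
    unit = np.repeat(np.arange(n_units), n_periods)
    time = np.tile(np.arange(n_periods), n_units)
    x1 = rng.normal(size=unit.size)
    x2 = rng.normal(size=unit.size)
    alpha = rng.normal(size=n_units)
    y = 1.0 * x1 - 0.5 * x2 + alpha[unit] + rng.normal(scale=0.3, size=unit.size)
    return pd.DataFrame({"unit": unit, "time": time, "y": y, "x1": x1, "x2": x2})

def fit_unit_fe(df, draws=500, tune=500, chains=2, seed=0):
    """Fit a unit fixed-effects model (model=1) and return (model, idata)."""
    # Lazy import: requires bayespecon and its sampling backend to be installed.
    from bayespecon import OLSPanelFE

    model = OLSPanelFE(
        formula="y ~ x1 + x2",
        data=df,
        unit_col="unit",
        time_col="time",
        model=1,
    )
    idata = model.fit(
        draws=draws, tune=tune, chains=chains,
        random_seed=seed, progressbar=False,
    )
    return model, idata

df = make_panel()
```

Calling model, idata = fit_unit_fe(df) would then sample draws * chains = 1000 posterior draws in total, and model.summary() returns the posterior summary table.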
property inference_data : arviz.data.inference_data.InferenceData | None
Return the ArviZ InferenceData from the most recent fit.
property pymc_model : pymc.model.core.Model | None
Return the PyMC model object built for the most recent fit.
spatial_diagnostics()
Run Bayesian LM specification tests and return a summary table.
Looks up the diagnostic suite registered for this model class and calls each test function on this fitted model, collecting the results into a tidy DataFrame. The set of tests depends on the model type; for example, an OLSPanelFE model runs Panel-LM-Lag, Panel-LM-Error, Panel-LM-SDM-Joint, and Panel-LM-SLX-Error-Joint.
Requires the model to have been fit (.fit() called) and a spatial weights matrix W to have been supplied at construction time.
Returns:
DataFrame indexed by test name with columns:
- statistic: Posterior mean of the LM statistic
- median: Posterior median of the LM statistic
- df: Degrees of freedom for the \(\chi^2\) reference
- p_value: Bayesian p-value: 1 - chi2.cdf(mean, df)
- ci_lower: Lower bound of 95% credible interval (2.5%)
- ci_upper: Upper bound of 95% credible interval (97.5%)
The DataFrame has attrs["model_type"] (class name) and attrs["n_draws"] (total posterior draws) metadata.
Return type:
pandas.DataFrame
Raises:
RuntimeError: If the model has not been fit yet.
See also
spatial_diagnostics_decision: Model-selection decision based on the test results.
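The columns of the returned table can be reproduced from a vector of posterior draws of an LM statistic. The sketch below follows the column definitions listed above; the helper summarise_lm_draws and the simulated draws are hypothetical and stand in for whatever the library's test functions actually produce.

```python
import numpy as np
from scipy.stats import chi2

def summarise_lm_draws(draws, df):
    """Summarise posterior draws of an LM statistic into one table row."""
    draws = np.asarray(draws, dtype=float)
    mean = draws.mean()
    ci_lower, ci_upper = np.percentile(draws, [2.5, 97.5])
    return {
        "statistic": mean,                    # posterior mean
        "median": np.median(draws),           # posterior median
        "df": df,                             # chi-square reference df
        "p_value": 1.0 - chi2.cdf(mean, df),  # Bayesian p-value at the mean
        "ci_lower": ci_lower,                 # 2.5% quantile
        "ci_upper": ci_upper,                 # 97.5% quantile
    }

# Hypothetical draws: a chi-square(2) cloud shifted away from zero.
rng = np.random.default_rng(1)
lm_draws = rng.chisquare(df=2, size=4000) + 6.0
row = summarise_lm_draws(lm_draws, df=2)
```

Note that the p-value is evaluated at the posterior mean of the statistic, so it is a single number per test rather than a distribution.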
spatial_diagnostics_decision(alpha=0.05, format='graphviz')
Return a model-selection decision from Bayesian LM test results.
Implements the decision tree from Koley and Bera [2024] (the Bayesian analogue of the classical stge_kb procedure in Anselin et al. [1996]), adapted for panel models following Elhorst [2014].
Parameters:
- alpha : float, default 0.05
Significance level for the Bayesian p-values.
- format : {"graphviz", "ascii", "model"}, default "graphviz"
Output format.
  - "model" returns the recommended-model name string.
  - "ascii" returns an indented box-drawing rendering of the full decision tree with the chosen path highlighted.
  - "graphviz" returns a graphviz.Digraph object that renders inline in Jupyter; if the optional graphviz package is not installed, a UserWarning is issued and the ASCII rendering is returned instead.
Returns:
Recommended model name when format="model", an ASCII tree string when format="ascii", or a graphviz.Digraph when format="graphviz" (with ASCII fallback if the dependency is missing).
Return type:
str or graphviz.Digraph
See also
spatial_diagnostics: Compute the Bayesian LM test statistics.
References
Koley and Bera [2024], Anselin et al. [1996], Elhorst [2014]
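Both diagnostics methods need a weights matrix W supplied at construction time. The sketch below builds a hypothetical (N, N) SciPy sparse matrix with ring contiguity, row-standardised by assumption (a common convention, not a documented requirement), and wraps the two documented calls in a helper. The names ring_weights and recommend_model are illustrative, and how the rows of W are aligned with unit labels is not specified on this page.

```python
import numpy as np
from scipy import sparse

def ring_weights(n_units):
    """Row-standardised ring contiguity: neighbours are the previous and next unit."""
    idx = np.arange(n_units)
    rows = np.concatenate([idx, idx])
    cols = np.concatenate([(idx - 1) % n_units, (idx + 1) % n_units])
    vals = np.full(rows.size, 0.5)  # two neighbours, each with weight 1/2
    return sparse.csr_matrix((vals, (rows, cols)), shape=(n_units, n_units))

def recommend_model(fitted_model, alpha=0.05):
    """Return the LM test table and the recommended-model name string.

    Expects a model that was built with W and has already been fit.
    """
    table = fitted_model.spatial_diagnostics()
    choice = fitted_model.spatial_diagnostics_decision(alpha=alpha, format="model")
    return table, choice

W = ring_weights(6)
```

Passing W=W to the constructor and calling recommend_model on the fitted model would return the test table together with the name chosen by the decision tree; format="ascii" gives a text rendering of the full tree instead.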
spatial_effects(return_posterior_samples=False)
Compute Bayesian inference for direct, indirect, and total impacts.
Computes impact measures for each posterior draw, then summarises the posterior distribution with means, 95% credible intervals, and Bayesian p-values.
Parameters:
- return_posterior_samples : bool, default False
If True, also return the posterior draws of the impacts alongside the summary table.
Returns:
If return_posterior_samples is False (default), returns a DataFrame indexed by feature names with columns for posterior means, credible-interval bounds, and Bayesian p-values.
If return_posterior_samples is True, returns (DataFrame, dict) where the dict has keys "direct", "indirect", "total", each mapping to a (G, k) array of posterior draws.
Return type: