bayespecon.dgp.simulate_ols

bayespecon.dgp.simulate_ols(n=None, W=None, gdf=None, beta=None, sigma=1.0, err_hetero=False, rng=None, seed=None, contiguity='queen', create_gdf=False, geometry_type='polygon')

Simulate data from a non-spatial OLS DGP y = X beta + eps.

Generates a random design matrix with an intercept column and len(beta) - 1 continuous regressors, and draws the response from a homoskedastic Normal error model by default. Setting err_hetero=True replaces the constant error scale with observation-specific standard deviations. No spatial weights matrix is required or produced; the function serves as a non-spatial baseline alongside the spatial DGPs.
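The mechanism described above can be sketched in plain NumPy. This is an illustrative stand-in, not the library's implementation: the helper name sketch_ols_dgp is hypothetical, and drawing the regressors from a standard Normal distribution is an assumption, since the description only says they are continuous.

```python
import numpy as np


def sketch_ols_dgp(n, beta=(1.0, 2.0), sigma=1.0, seed=None):
    """Hypothetical stand-in for y = X beta + eps (not the library code)."""
    rng = np.random.default_rng(seed)
    beta = np.asarray(beta, dtype=float)
    k = beta.shape[0]
    # Intercept column followed by k - 1 continuous regressors.
    # Assumption: regressors are standard Normal draws.
    X = np.column_stack([np.ones(n), rng.standard_normal((n, k - 1))])
    # Homoskedastic Normal innovations with standard deviation sigma.
    eps = rng.normal(0.0, sigma, size=n)
    y = X @ beta + eps
    return {"y": y, "X": X, "params_true": {"beta": beta, "sigma": sigma}}


out = sketch_ols_dgp(n=500, beta=[1.0, 2.0], sigma=1.0, seed=0)
```

The returned dictionary mirrors the documented keys (y, X, params_true) so that the sketch can stand in for the real output in quick experiments.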

Parameters:
n : int, optional

Number of observations when neither W nor gdf is provided.

W : Graph or sparse/dense matrix, optional

Spatial weights input used only to infer n and validate dimensions. Not used in the OLS data-generating mechanism.

gdf : geopandas.GeoDataFrame, optional

Spatial units source used only to infer n when W is not provided.

beta : array-like, optional

Coefficient vector including intercept. Defaults to [1.0, 2.0] (intercept = 1, one regressor with slope = 2).

sigma : float, default=1.0

Innovation standard deviation \(\sigma\).

err_hetero : bool, default=False

If True, generate heteroskedastic innovations with observation-specific standard deviations \(\sigma_i = \sigma \sqrt{1 + \|x_i\|^2}\).

rng : numpy.random.Generator, optional

Random generator instance for reproducibility.

seed : int, optional

Integer seed used when rng is not supplied.

contiguity : str, default="queen"

Neighbor rule used when inferring n from gdf.

create_gdf : bool, default=False

If True, generates geometry on an n-unit grid and returns a GeoDataFrame combining that geometry with y and X_* columns under the gdf key.

geometry_type : {"point", "polygon"}, default="polygon"

Geometry type to generate when create_gdf=True.
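The err_hetero scaling \(\sigma_i = \sigma \sqrt{1 + \|x_i\|^2}\) can be illustrated with a short NumPy sketch. Two points are assumptions here rather than documented behavior: that \(x_i\) refers to the non-intercept regressors of row i, and that the innovations are Normal draws scaled by \(\sigma_i\).

```python
import numpy as np

rng = np.random.default_rng(123)
n, sigma = 5, 1.0

# Design matrix with an intercept and one regressor (standard Normal draws
# are an assumption of this sketch).
X = np.column_stack([np.ones(n), rng.standard_normal((n, 1))])

# Assumption: x_i is the non-intercept part of row i.
x = X[:, 1:]

# Observation-specific standard deviations: sigma * sqrt(1 + ||x_i||^2).
sigma_i = sigma * np.sqrt(1.0 + np.sum(x**2, axis=1))

# Assumption: Normal innovations, each scaled by its own sigma_i.
eps = rng.standard_normal(n) * sigma_i
```

Under this formula every \(\sigma_i\) is at least \(\sigma\), and observations with larger regressor values receive noisier errors.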

Returns:

Keys:

  • y : np.ndarray of shape (n,) — response variable.

  • X : np.ndarray of shape (n, k) — design matrix with intercept in the first column.

  • params_true : dict with beta and sigma.

  • gdf : GeoDataFrame (only present when create_gdf=True).

Return type:

dict
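One way to use the returned dictionary is to check that a least-squares fit recovers params_true. The sketch below builds a stand-in dictionary with the documented keys so that it runs without the package installed; the key names beta and sigma inside params_true follow the description above, and the simulated values are illustrative only.

```python
import numpy as np

rng = np.random.default_rng(7)
n = 2000
beta_true = np.array([1.0, 2.0])

# Stand-in for the documented return value (not produced by the library).
X = np.column_stack([np.ones(n), rng.standard_normal(n)])
out = {
    "y": X @ beta_true + rng.normal(0.0, 1.0, size=n),
    "X": X,
    "params_true": {"beta": beta_true, "sigma": 1.0},
}

# Ordinary least squares on the simulated data.
beta_hat, *_ = np.linalg.lstsq(out["X"], out["y"], rcond=None)

# Residual standard deviation with a degrees-of-freedom correction.
resid = out["y"] - out["X"] @ beta_hat
sigma_hat = np.sqrt(resid @ resid / (n - out["X"].shape[1]))
```

With a homoskedastic DGP and a moderately large n, beta_hat and sigma_hat should land close to the values stored in params_true.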