bayespecon.dgp.generate_poisson_flow_data
bayespecon.dgp.generate_poisson_flow_data(n=None, k=2, k_d=None, k_o=None, rho_d=0.3, rho_o=0.2, rho_w=0.1, beta_d=None, beta_o=None, gamma_dist=-0.5, seed=42, G=None, err_hetero=False, gdf=None, knn_k=4)
Generate synthetic origin-destination flow count data for a Poisson spatial autoregressive flow model.
The data-generating process follows:
\[\eta = A(\rho_d, \rho_o, \rho_w)^{-1} X\beta, \qquad y_{ij} \sim \operatorname{Poisson}(\exp(\eta_{ij}))\]
where the system matrix is
\[A = I_N - \rho_d (I_n \otimes W) - \rho_o (W \otimes I_n) - \rho_w (W \otimes W), \quad N = n^2\]
and \(W\) is the row-standardised spatial weight matrix.
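The process above can be sketched with plain NumPy. This is an illustration under stated assumptions, not the library's implementation: the ring-shaped W, the design matrix X, the coefficient vector beta and the helper ring_weights are hypothetical stand-ins, and the row-major reshape into an n × n matrix is assumed.

```python
import numpy as np


def ring_weights(n):
    """Hypothetical helper: row-standardised weights on a ring (two neighbours each)."""
    W = np.zeros((n, n))
    for i in range(n):
        W[i, (i - 1) % n] = 0.5
        W[i, (i + 1) % n] = 0.5
    return W


rng = np.random.default_rng(0)
n = 4                      # spatial units
N = n * n                  # origin-destination pairs
W = ring_weights(n)
I_n = np.eye(n)

rho_d, rho_o, rho_w = 0.3, 0.2, 0.1

# System matrix A = I_N - rho_d (I_n ⊗ W) - rho_o (W ⊗ I_n) - rho_w (W ⊗ W)
A = (
    np.eye(N)
    - rho_d * np.kron(I_n, W)
    - rho_o * np.kron(W, I_n)
    - rho_w * np.kron(W, W)
)

# Illustrative design matrix and coefficients (assumed, not the library's layout)
X = np.column_stack([np.ones(N), rng.normal(size=(N, 2))])
beta = np.array([0.2, 0.5, 0.5])

# eta = A^{-1} X beta, computed with a linear solve rather than an explicit inverse
eta = np.linalg.solve(A, X @ beta)
lam = np.exp(eta)              # Poisson means
y_vec = rng.poisson(lam)       # flattened counts, shape (N,)
y_mat = y_vec.reshape(n, n)    # assumed row-major reshape
```

Solving the linear system avoids forming the inverse of A explicitly, which is the usual choice when only the filtered predictor is needed.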
- Parameters:
- n : int, default 10
Approximate number of spatial units. When neither G nor gdf is provided, a rook-contiguity grid with round(sqrt(n)) units per side is created, yielding approximately n units. Total number of flows is N = n_actual^2. When G is provided, n must match the number of units in G.
- k : int, default 2
Number of destination/origin attribute columns when k_d and k_o are not specified (excluding intercepts added internally). Ignored when k_d and/or k_o are provided, or when beta_d/beta_o are lists whose length determines k_d/k_o.
- k_d : int or None, default None
Number of destination-side attribute columns. Overrides k for the destination side when provided.
- k_o : int or None, default None
Number of origin-side attribute columns. Overrides k for the origin side when provided.
- rho_d : float, default 0.3
Destination autocorrelation parameter.
- rho_o : float, default 0.2
Origin autocorrelation parameter.
- rho_w : float, default 0.1
Network autocorrelation parameter.
- beta_d : float or list of float or None, default None
Destination-side coefficients for the k attributes. A scalar broadcasts to all columns. Defaults to 1.0 for all columns.
- beta_o : float or list of float or None, default None
Origin-side coefficients. Defaults to 1.0 for all columns.
- seed : int, default 42
Seed for numpy.random.default_rng.
- G : libpysal.graph.Graph or None, default None
Row-standardised spatial graph on n units. If None, a rook-contiguity graph on a regular grid is constructed automatically via resolve_weights().
- err_hetero : bool, default False
Accepted for API parity with other DGP functions; ignored for the Poisson model (the variance is determined by the mean).
- gdf : GeoDataFrame or None, default None
Accepted for API parity; ignored (use G instead).
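The size arithmetic described for n and the scalar broadcasting described for beta_d and beta_o can be sketched as follows. Both helpers, grid_sizes and broadcast_beta, are hypothetical names introduced for illustration. Whether the library rounds with Python's built-in round or another routine is an assumption.

```python
import math

import numpy as np


def grid_sizes(n):
    """Hypothetical helper: side length, actual unit count, and flow count for a requested n."""
    side = round(math.sqrt(n))     # units per side of the square grid
    n_actual = side * side         # units actually created
    return side, n_actual, n_actual ** 2


def broadcast_beta(beta, k):
    """Hypothetical helper: expand None, a scalar, or a list into k coefficients."""
    if beta is None:
        return np.ones(k)          # documented default of 1.0 per column
    return np.broadcast_to(np.asarray(beta, dtype=float), (k,)).copy()


side, n_actual, N = grid_sizes(10)   # a request for 10 units yields a 3 x 3 grid
beta_d = broadcast_beta(0.5, 3)      # scalar repeated across three columns
beta_o = broadcast_beta(None, 2)     # default coefficients
```

With n=10 the grid has 3 units per side, so 9 units and 81 flows are produced. This is why n is described as approximate.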
- Returns:
- y_vec : np.ndarray, shape (N,), dtype int64
Flattened count observations.
- y_mat : np.ndarray, shape (n, n), dtype int64
Count observations reshaped as an O×D matrix.
- eta_vec : np.ndarray, shape (N,)
Log-mean (spatially filtered linear predictor).
- lambda_vec : np.ndarray, shape (N,)
Poisson means (\(\exp(\eta_{ij})\)).
- Xd : np.ndarray, shape (n, k)
Destination-side regional attribute matrix.
- Xo : np.ndarray, shape (n, k)
Origin-side regional attribute matrix.
- X : np.ndarray, shape (N, p)
Full O-D design matrix (for model fitting).
- design : FlowDesignMatrix
Full O-D design matrix (for downstream inspection).
- W : np.ndarray, shape (n, n)
Dense row-standardised weight matrix.
- G : libpysal.graph.Graph
Spatial graph.
- rho_d, rho_o, rho_w
True autocorrelation parameters.
- beta_d, beta_o
True coefficient vectors.
- Return type:
dict with the keys listed under Returns
- Raises:
np.linalg.LinAlgError – If the system matrix \(A\) is singular (usually because rho_d + rho_o + rho_w >= 1).
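Why the boundary rho_d + rho_o + rho_w = 1 makes \(A\) singular can be checked numerically. Every row of a row-standardised W sums to 1, so the vector of ones satisfies \(A\mathbf{1} = (1 - \rho_d - \rho_o - \rho_w)\mathbf{1}\). The sketch below uses a hypothetical 3-unit W and a hypothetical helper system_matrix. It illustrates the algebra only and does not reproduce the library's own check.

```python
import numpy as np

# Hypothetical row-standardised weights: three mutually neighbouring units
W = np.array([
    [0.0, 0.5, 0.5],
    [0.5, 0.0, 0.5],
    [0.5, 0.5, 0.0],
])
n = W.shape[0]
N = n * n
I_n = np.eye(n)


def system_matrix(rho_d, rho_o, rho_w):
    """Hypothetical helper building A from the formula in this section."""
    return (
        np.eye(N)
        - rho_d * np.kron(I_n, W)
        - rho_o * np.kron(W, I_n)
        - rho_w * np.kron(W, W)
    )


ones = np.ones(N)

# Defaults: the parameters sum to 0.6, so A maps ones to 0.4 * ones
A_ok = system_matrix(0.3, 0.2, 0.1)
scaled = A_ok @ ones

# Boundary: the parameters sum to 1, so ones lies in the null space of A
A_bad = system_matrix(0.5, 0.3, 0.2)
null_image = A_bad @ ones
```

In floating point a singular matrix may appear only nearly singular, so a solver is not guaranteed to raise in every such case. Checking the parameter sum before simulating is a cheap safeguard.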
Examples
>>> from bayespecon.dgp import generate_poisson_flow_data
>>> data = generate_poisson_flow_data(n=9, seed=0)
>>> data["y_mat"].dtype
dtype('int64')
>>> data["lambda_vec"].shape
(81,)
>>> data["Xd"].shape
(9, 2)
>>> data["Xo"].shape
(9, 2)