Skip to content

bayespecon

Panel Flow Models with fit_approx

pysal/bayespecon

Panel Flow Models with `fit_approx`¶

This notebook demonstrates the panel flow model classes and focuses on fit_approx, the variational inference entry point for fast approximate posterior inference.

The examples use very small synthetic panels so the notebook executes quickly during the docs build while still exercising the full workflow for:

SARFlowPanel
SARFlowSeparablePanel
PoissonSARFlowPanel
PoissonSARFlowSeparablePanel

import warnings

import numpy as np
import pandas as pd

from bayespecon import (
    PoissonSARFlowPanel,
    PoissonSARFlowSeparablePanel,
    SARFlowPanel,
    SARFlowSeparablePanel,
)
from bayespecon.dgp.flows import (
    generate_panel_flow_data,
    generate_panel_flow_data_separable,
    generate_panel_poisson_flow_data,
    generate_panel_poisson_flow_data_separable,
)

warnings.filterwarnings("ignore", category=FutureWarning)
np.set_printoptions(precision=3, suppress=True)
pd.options.display.float_format = "{:.3f}".format

/home/runner/micromamba/envs/test/lib/python3.14/site-packages/tqdm/auto.py:21: TqdmWarning: IProgress not found. Please update jupyter and ipywidgets. See https://ipywidgets.readthedocs.io/en/stable/user_install.html
  from .autonotebook import tqdm as notebook_tqdm

# Each panel DGP shares the auto-G + default `log_distance` behaviour of
# the cross-sectional flow DGPs: omit `G`/`gdf` and a synthetic point grid +
# KNN graph (`knn_k=4`) is built on the fly. The fitted graph is returned
# in the result dict so the model classes can re-use it.
T = 2
FIT_APPROX_KW = dict(draws=80, n=150, progressbar=False, random_seed=0)

gaussian_data = generate_panel_flow_data(
    n=4,
    T=T,
    rho_d=0.20,
    rho_o=0.15,
    rho_w=0.05,
    beta_d=[1.0],
    beta_o=[0.4],
    sigma=0.5,
    sigma_alpha=0.2,
    seed=10,
)
gaussian_sep_data = generate_panel_flow_data_separable(
    n=4,
    T=T,
    rho_d=0.25,
    rho_o=0.15,
    beta_d=[1.0],
    beta_o=[0.4],
    sigma=0.5,
    sigma_alpha=0.2,
    seed=11,
)
poisson_data = generate_panel_poisson_flow_data(
    n=4,
    T=T,
    rho_d=0.20,
    rho_o=0.10,
    rho_w=0.05,
    beta_d=[0.6],
    beta_o=[0.2],
    seed=12,
    k=1,
)
poisson_sep_data = generate_panel_poisson_flow_data_separable(
    n=4,
    T=T,
    rho_d=0.20,
    rho_o=0.10,
    beta_d=[0.6],
    beta_o=[0.2],
    seed=13,
    k=1,
)

# All four DGPs synthesise the same geometry from `n=4`, so any of the
# returned graphs is fine to share with the model constructors below.
G = gaussian_data["G"]

pd.DataFrame(
    [
        {
            "dataset": "Gaussian unrestricted",
            "y_shape": gaussian_data["y"].shape,
            "X_shape": gaussian_data["X"].shape,
            "features": gaussian_data["col_names"],
        },
        {
            "dataset": "Gaussian separable",
            "y_shape": gaussian_sep_data["y"].shape,
            "X_shape": gaussian_sep_data["X"].shape,
            "features": gaussian_sep_data["col_names"],
        },
        {
            "dataset": "Poisson unrestricted",
            "y_shape": poisson_data["y"].shape,
            "X_shape": poisson_data["X"].shape,
            "features": poisson_data["col_names"],
        },
        {
            "dataset": "Poisson separable",
            "y_shape": poisson_sep_data["y"].shape,
            "X_shape": poisson_sep_data["X"].shape,
            "features": poisson_sep_data["col_names"],
        },
    ]
)

	dataset	y_shape	X_shape	features
0	Gaussian unrestricted	(32,)	(32, 6)	[intercept, intra_indicator, dest_x0, orig_x0,...
1	Gaussian separable	(32,)	(32, 6)	[intercept, intra_indicator, dest_x0, orig_x0,...
2	Poisson unrestricted	(32,)	(32, 6)	[intercept, intra_indicator, dest_x0, orig_x0,...
3	Poisson separable	(32,)	(32, 6)	[intercept, intra_indicator, dest_x0, orig_x0,...

Asymmetric attributes: generate_panel_flow_data and generate_panel_poisson_flow_data now support beta_d and beta_o with different lengths (k_d ≠ k_o). The design matrix layout adjusts automatically. For example:
data = generate_panel_flow_data(n=5, T=3, beta_d=[1.0, -0.5], beta_o=[0.4], ...)

Note — lognormal default (since v0.2). generate_panel_flow_data now returns strictly-positive flows by default, drawn from \(y_t = \exp(\eta_t)\) where \(\eta_t = A^{-1}(X_t\beta + \alpha + \varepsilon_t)\) is the latent SAR-filtered linear predictor (also exposed as data['eta']). The Gaussian-likelihood panel models (SARFlowPanel, SARFlowSeparablePanel) operate on the latent scale, so we fit on np.log(data['y']). Pass distribution="normal" to recover the legacy Gaussian-on-y behaviour.
The Poisson-likelihood models are unchanged (their DGPs always emit non-negative counts).

`fit_approx` pattern¶

Each model uses the same basic workflow:

Construct the panel model.
Call fit_approx(...) with either method='advi' or method='fullrank_advi'.
Inspect model.summary(...) or model.inference_data.

In these examples, n is the number of optimization iterations used by pm.fit, while draws is the number of posterior samples drawn from the fitted variational approximation.

gaussian_model = SARFlowPanel(
    y=np.log(gaussian_data["y"]),  # latent SAR scale (lognormal default)
    G=G,
    X=gaussian_data["X"],
    T=T,
    col_names=gaussian_data["col_names"],
    model=0,
    miter=5,
    titer=50,
    trace_seed=0,
)
gaussian_idata = gaussian_model.fit_approx(method="advi", **FIT_APPROX_KW)

print(type(gaussian_model.approximation).__name__)
print(sorted(gaussian_idata.posterior.data_vars))
gaussian_model.summary(var_names=["rho_d", "rho_o", "rho_w", "sigma"], round_to=3)

MeanField
['beta', 'rho_d', 'rho_o', 'rho_simplex', 'rho_w', 'sigma']

Finished [100%]: Average Loss = 188.54
arviz - WARNING - Shape validation failed: input_shape: (1, 80), minimum_shape: (chains=2, draws=4)

	mean	sd	hdi_3%	hdi_97%	mcse_mean	mcse_sd	ess_bulk	ess_tail	r_hat
rho_d	0.240	0.157	0.031	0.556	0.017	0.013	71.138	92.072	NaN
rho_o	0.216	0.124	0.013	0.445	0.019	0.011	46.588	71.660	NaN
rho_w	0.257	0.159	0.017	0.552	0.022	0.011	31.128	33.037	NaN
sigma	13.429	11.120	1.372	38.170	1.203	1.573	78.438	95.145	NaN

gaussian_sep_model = SARFlowSeparablePanel(
    y=np.log(gaussian_sep_data["y"]),  # latent SAR scale (lognormal default)
    G=G,
    X=gaussian_sep_data["X"],
    T=T,
    col_names=gaussian_sep_data["col_names"],
    model=0,
    trace_seed=0,
)
gaussian_sep_idata = gaussian_sep_model.fit_approx(
    method="fullrank_advi", **FIT_APPROX_KW
)

sep_identity_error = np.max(
    np.abs(
        gaussian_sep_idata.posterior["rho_w"].values
        + gaussian_sep_idata.posterior["rho_d"].values
        * gaussian_sep_idata.posterior["rho_o"].values
    )
)
print(sorted(gaussian_sep_idata.posterior.data_vars))
print(f"max |rho_w + rho_d * rho_o| = {sep_identity_error:.3e}")
gaussian_sep_model.summary(var_names=["rho_d", "rho_o", "rho_w", "sigma"], round_to=3)

['beta', 'rho_d', 'rho_o', 'rho_w', 'sigma']
max |rho_w + rho_d * rho_o| = 0.000e+00

Finished [100%]: Average Loss = 212.52
arviz - WARNING - Shape validation failed: input_shape: (1, 80), minimum_shape: (chains=2, draws=4)

	mean	sd	hdi_3%	hdi_97%	mcse_mean	mcse_sd	ess_bulk	ess_tail	r_hat
rho_d	0.143	0.515	-0.754	0.942	0.062	0.027	65.607	57.622	NaN
rho_o	0.034	0.476	-0.809	0.676	0.066	0.021	52.964	70.031	NaN
rho_w	-0.016	0.233	-0.420	0.429	0.038	0.014	39.354	73.158	NaN
sigma	22.456	42.706	0.247	73.518	4.657	11.030	62.517	92.072	NaN

Poisson panel flow models¶

The Poisson panel variants are currently pooled-only (model=0). By default, fit_approx omits the large deterministic lambda array from the stored posterior samples, which keeps the inference data compact.

poisson_model = PoissonSARFlowPanel(
    y=poisson_data["y"],
    G=G,
    X=poisson_data["X"],
    T=T,
    col_names=poisson_data["col_names"],
    model=0,
    miter=5,
    titer=50,
    trace_seed=0,
)
poisson_idata = poisson_model.fit_approx(method="advi", **FIT_APPROX_KW)

print(sorted(poisson_idata.posterior.data_vars))
print("lambda" in poisson_idata.posterior.data_vars)
poisson_model.summary(var_names=["rho_d", "rho_o", "rho_w"], round_to=3)

['beta', 'rho_d', 'rho_o', 'rho_simplex', 'rho_w']
False

Finished [100%]: Average Loss = 2.717e+16
arviz - WARNING - Shape validation failed: input_shape: (1, 80), minimum_shape: (chains=2, draws=4)

	mean	sd	hdi_3%	hdi_97%	mcse_mean	mcse_sd	ess_bulk	ess_tail	r_hat
rho_d	0.285	0.170	0.019	0.550	0.020	0.013	67.322	90.717	NaN
rho_o	0.237	0.141	0.053	0.502	0.019	0.010	52.821	70.031	NaN
rho_w	0.236	0.142	0.026	0.511	0.015	0.010	60.230	92.072	NaN

poisson_sep_model = PoissonSARFlowSeparablePanel(
    y=poisson_sep_data["y"],
    G=G,
    X=poisson_sep_data["X"],
    T=T,
    col_names=poisson_sep_data["col_names"],
    model=0,
    trace_seed=0,
)
poisson_sep_idata = poisson_sep_model.fit_approx(
    method="fullrank_advi", **FIT_APPROX_KW
)

comparison = pd.DataFrame(
    [
        {
            "model": "SARFlowPanel",
            "method": "advi",
            "posterior_vars": ", ".join(sorted(gaussian_idata.posterior.data_vars)),
        },
        {
            "model": "SARFlowSeparablePanel",
            "method": "fullrank_advi",
            "posterior_vars": ", ".join(sorted(gaussian_sep_idata.posterior.data_vars)),
        },
        {
            "model": "PoissonSARFlowPanel",
            "method": "advi",
            "posterior_vars": ", ".join(sorted(poisson_idata.posterior.data_vars)),
        },
        {
            "model": "PoissonSARFlowSeparablePanel",
            "method": "fullrank_advi",
            "posterior_vars": ", ".join(sorted(poisson_sep_idata.posterior.data_vars)),
        },
    ]
)

print(sorted(poisson_sep_idata.posterior.data_vars))
poisson_sep_model.summary(var_names=["rho_d", "rho_o", "rho_w"], round_to=3)

comparison

['beta', 'rho_d', 'rho_o', 'rho_w']

Finished [100%]: Average Loss = 2.6614e+19
arviz - WARNING - Shape validation failed: input_shape: (1, 80), minimum_shape: (chains=2, draws=4)

	model	method	posterior_vars
0	SARFlowPanel	advi	beta, rho_d, rho_o, rho_simplex, rho_w, sigma
1	SARFlowSeparablePanel	fullrank_advi	beta, rho_d, rho_o, rho_w, sigma
2	PoissonSARFlowPanel	advi	beta, rho_d, rho_o, rho_simplex, rho_w
3	PoissonSARFlowSeparablePanel	fullrank_advi	beta, rho_d, rho_o, rho_w