25  Ecometrics and CFA

Sampson and Raudenbush outline a theory of “ecometrics” designed to capture the social-ecological context as a way to measure the latent construct of collective efficacy. A key feature of the original methodology is that it relies on direct observations of social interaction, as measured by a purpose-built survey. These data are combined with others gathered via systematic social observation (SSO).

This idea has been expanded to include new forms of data such as Google Street View imagery and volunteered geographic information (VGI) (e.g. the work in Boston). We usually don’t have SSO data, but we do have plenty of other data, such as 311 reports, Google Street View, and satellite imagery, and we may be able to substitute these for SSO if we believe they accurately capture the social process under investigation (i.e. if we believe that some social process like “disorder” is the underlying driver of the observed data).

The key distinction between ecometrics and earlier methods like factor ecology is the reliance on a formal theoretical model underneath. We’re not allowing the data to speak for themselves; instead we are specifying a set of theoretical social processes which are unobservable directly, but might be inferred to exist if we treat them as latent variables. We then fit a model and test whether these latent constructs appear as specified.
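To make the latent-variable idea concrete, the following minimal sketch simulates a single unobserved construct (call it “disorder”) that drives three observed indicators. The loadings, sample size, and seed are illustrative assumptions, not estimates from any real data. Because every indicator depends on the same latent score, the indicators correlate with one another, and under this one-factor setup the correlation between any two of them should sit near the product of their loadings.

```python
import random
import statistics


def simulate_indicators(n_places=5000, loadings=(0.8, 0.7, 0.6), seed=0):
    """Simulate observed indicators driven by one unobserved factor.

    All values are hypothetical; nothing here comes from real data.
    """
    rng = random.Random(seed)
    columns = [[] for _ in loadings]
    for _ in range(n_places):
        latent = rng.gauss(0, 1)  # the construct itself is never recorded
        for column, loading in zip(columns, loadings):
            # Scale the unique noise so each indicator has unit variance.
            noise_sd = (1 - loading**2) ** 0.5
            column.append(loading * latent + rng.gauss(0, noise_sd))
    return columns


def pearson(x, y):
    """Pearson correlation between two equal-length sequences."""
    mean_x, mean_y = statistics.fmean(x), statistics.fmean(y)
    cov = sum((a - mean_x) * (b - mean_y) for a, b in zip(x, y))
    var_x = sum((a - mean_x) ** 2 for a in x)
    var_y = sum((b - mean_y) ** 2 for b in y)
    return cov / (var_x * var_y) ** 0.5


x1, x2, x3 = simulate_indicators()
r12, r13, r23 = pearson(x1, x2), pearson(x1, x3), pearson(x2, x3)
# Under the one-factor setup the implied correlations are the loading
# products: 0.8 * 0.7, 0.8 * 0.6, and 0.7 * 0.6.
print(round(r12, 2), round(r13, 2), round(r23, 2))
```

A confirmatory model works in the opposite direction: it starts from the observed correlations and asks whether a hypothesized loading pattern can reproduce them.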

Using the ecometric framework, we can try to capture the geography of opportunity following the theory outlined by Galster, who argues that individual-level outcomes are a function of individual characteristics as well as spatial characteristics at multiple scales.

\[O_{it} = \alpha + \beta[P_{it}] + \gamma[P_i] + \varphi[UP_{it}] + \delta[UP_i] + \theta[N_{jt}] + \mu[M_{kt}] + \epsilon\]
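As a minimal sketch of how the terms in this equation combine, the snippet below treats each bracketed term as a vector of characteristics multiplied by a matching coefficient vector. The numeric values are hypothetical and chosen only so the arithmetic is easy to follow. Reading \(P\) as individual characteristics and \(N\) and \(M\) as spatial characteristics is an assumption based on the surrounding text; the unobserved \(UP\) terms are left out for brevity.

```python
def dot(coefficients, values):
    """Inner product of a coefficient vector and a characteristic vector."""
    if len(coefficients) != len(values):
        raise ValueError("coefficient and value vectors must match in length")
    return sum(c * v for c, v in zip(coefficients, values))


def outcome(alpha, terms, epsilon=0.0):
    """Evaluate alpha + sum(coefficient * characteristic terms) + epsilon.

    `terms` maps a label such as "N_jt" to a (coefficients, values) pair.
    Returns the total and the contribution of each labelled term.
    """
    contributions = {name: dot(c, v) for name, (c, v) in terms.items()}
    return alpha + sum(contributions.values()) + epsilon, contributions


# Hypothetical coefficients and characteristics, for illustration only.
terms = {
    "P_it": ([0.5, 0.25], [2.0, 4.0]),  # beta[P_it]
    "N_jt": ([1.5], [2.0]),             # theta[N_jt]
    "M_kt": ([0.5], [4.0]),             # mu[M_kt]
}
total, parts = outcome(alpha=1.0, terms=terms)
spatial = parts["N_jt"] + parts["M_kt"]  # the share tied to place
print(total, spatial)  # 8.0 5.0
```

Separating the contributions this way makes it easy to see how much of a predicted outcome is attributed to place, as opposed to the individual.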


To capture the geography of opportunity, then, our focus is on the \(N\) and \(M\) components of the equation, which carry the spatial characteristics.

According to Galster (2013), the key vectors of these terms are composed of four categories: social-interactive, environmental, geographic, and institutional.

These categories are well supported by the empirical literature, and are theoretically grounded in causal processes that generate socioeconomic outcomes.

Following Knaap:

One way to address this problem is to treat the quantification of opportunity as a measurement error problem. Through a liberal interpretation, this may be viewed as an extension of ecometrics, a methodology concerned with developing measures of neighborhood social ecology (Raudenbush and Sampson 1999; Mujahid et al. 2007; O'Brien et al. 2013). In this framework, opportunity and its subdimensions are viewed as latent variables that cannot be measured directly, but can be estimated by modeling the covariation among the indicators through which they manifest. As with any measurement model, however, opportunity metrics require a sound theoretical framework for organizing and specifying relationships among variables. As described above, a major weakness of opportunity analyses to date has been the lack of a sound framework for organizing indicators into categories of metrics. To address this issue, I argue that the literature on neighborhood effects offers a sound organizing framework for classifying subdimensions of opportunity. Specifically, I propose that neighborhood indicators should be categorized according to the four mechanisms of neighborhood effects outlined by Galster (2013): social-interactive, environmental, geographic, and institutional. These categories are well supported by the empirical literature, and are theoretically grounded in causal processes that generate socioeconomic outcomes.
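One way to encode this organizing framework is as a mapping from each latent mechanism to the indicators assumed to measure it. The sketch below uses hypothetical indicator names (none come from the source) and derives the binary loading pattern that a confirmatory model would estimate. To my understanding, a dictionary of this shape is also what `ModelSpecificationParser.parse_model_specification_from_dict` in `factor_analyzer` accepts, though the sketch itself relies only on the standard library.

```python
# Hypothetical indicator names grouped by Galster's four mechanisms.
model_dict = {
    "social_interactive": ["poverty_rate", "unemployment_rate", "pct_college"],
    "environmental": ["air_toxics", "violent_crime_rate", "vacancy_rate"],
    "geographic": ["job_access", "transit_access"],
    "institutional": ["school_test_scores", "school_poverty_share"],
}


def loading_pattern(spec):
    """Return (indicators, factors, pattern).

    pattern[i][j] is 1 when indicator i is allowed to load on factor j,
    and 0 when that loading is fixed at zero.
    """
    factors = list(spec)
    indicators = [name for names in spec.values() for name in names]
    if len(set(indicators)) != len(indicators):
        raise ValueError("each indicator should load on exactly one factor")
    pattern = [
        [1 if name in spec[factor] else 0 for factor in factors]
        for name in indicators
    ]
    return indicators, factors, pattern


indicators, factors, pattern = loading_pattern(model_dict)
for name, row in zip(indicators, pattern):
    print(f"{name:22s} {row}")
```

Fixing the off-pattern loadings at zero is what separates this confirmatory specification from an exploratory approach such as factor ecology, where every indicator may load on every factor.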

from factor_analyzer import (ConfirmatoryFactorAnalyzer, ModelSpecificationParser)
from geosnap import DataStore
from geosnap import analyze as gaz
from geosnap import visualize as gvz
from geosnap import io as gio
datasets = DataStore()
balt = gio.get_acs(datasets, msa_fips='12580', years=[2020], level='bg')
Signature: gio.get_nces(datastore, years='1516', dataset='sabs')
Docstring:
Extract a subset of data from the National Center for Educational Statistics as a long-form geodataframe.

Parameters
----------
datastore : geosnap.DataStore
    an instantiated DataStore object
years : str, optional
    set of academic years to return formatted as a 4-digit string representing the two years
    from a single period of the academic calendar. For example, the 2015-2016 academic year
    is represented as "1516". Defaults to "1516"
dataset : str, optional
    which NCES dataset to query. Options include `sabs`, `districts`, or `schools`
    Defaults to 'sabs'

Returns
-------
geopandas.GeoDataFrame
    long-form geodataframe with 'year' column representing each time period

File:      ~/mambaforge/envs/urban_analysis/lib/python3.10/site-packages/geosnap/io/constructors.py
Type:      function
sabs = gio.get_nces(datasets)
sabs = sabs[sabs.intersects(balt.to_crs(sabs.crs).unary_union)]

# load achievement data from the Stanford Education Data Archive (SEDA);
# the EULA must be accepted explicitly
seda = datasets.seda(accept_eula=True)
sabs = sabs.merge(seda, left_on='ncessch', right_on='sedasch')
