Rather than looking only at the pooled data, we can use the long-form version of SEDA's grade-cohort-standardized (GCS) data, which is recommended for public presentation. That dataset contains district-level observations for each year between 2009 and 2018. Using geosnap, we can also examine how educational achievement has evolved over time and space. Here, we focus on math scores.
Code
ca_dists = gpd.read_parquet('data/ca_dists.parquet')
seda_dists = datasets.seda(accept_eula=True, level='geodist', pooling='long', standardize='gcs')
math_dists = seda_dists[seda_dists['subject']=='mth']
math_dists = math_dists.groupby(['sedalea', 'year']).mean() # average over all grades
math_dists = math_dists.dropna(subset=['gcs_mn_all'])
math_dists = math_dists.reset_index()
math_dists['gcs_mn_all'] = math_dists['gcs_mn_all'] - math_dists['grade'] # I think this is how you're supposed to handle averaging over grades
math_dists = gpd.GeoDataFrame(math_dists.merge(ca_dists.drop(columns=['year']), right_on='GEOID', left_on='sedalea'))
math_dists.to_parquet('data/math_dists_long.parquet')
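The subtraction step above relies on the GCS scale being expressed in grade-level units, so that a score minus its grade can be read as distance from the national average for that grade. Because the mean is linear, subtracting the mean grade from the mean score should match averaging the per-grade gaps. Here is a minimal sketch of that equivalence. The district id, year, and scores are made-up values rather than SEDA data, and the `gap` column is a hypothetical name used only for illustration.

```python
import pandas as pd

# Hypothetical toy data: one district, one year, three grades (not SEDA values).
toy = pd.DataFrame({
    "sedalea": [1, 1, 1],
    "year": [2015, 2015, 2015],
    "grade": [3, 4, 5],
    "gcs_mn_all": [2.5, 3.25, 4.5],
})

# Approach used above: average over grades, then subtract the mean grade.
avg = toy.groupby(["sedalea", "year"]).mean().reset_index()
avg["gap"] = avg["gcs_mn_all"] - avg["grade"]

# Alternative: compute each grade's gap first, then average the gaps.
toy["gap"] = toy["gcs_mn_all"] - toy["grade"]
gap_first = toy.groupby(["sedalea", "year"])["gap"].mean().reset_index()

# Both routes give the same district-year value (up to floating-point error).
same = abs(avg["gap"].iloc[0] - gap_first["gap"].iloc[0]) < 1e-9
```

The two routes agree here because every row has a score. Since pandas skips missing values column by column when averaging, a grade with a missing `gcs_mn_all` would still count toward the mean grade, so computing the gap per grade first may be the safer order when scores are incomplete.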
Plotting the average math achievement over time shows a relatively steady increase, though it remains more than half a grade level below the national average (the line never rises above -0.5 on the y-axis).
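As a rough sketch of how such a yearly average could be computed, the example below takes an unweighted mean across districts for each year and checks whether it stays below -0.5. The unweighted mean is an assumption (the figure may have been built differently, for example with enrollment weights), and the data are made-up values standing in for `math_dists`.

```python
import pandas as pd

# Hypothetical grade-centered scores for two districts in two years (not SEDA values).
toy = pd.DataFrame({
    "sedalea": [1, 2, 1, 2],
    "year": [2009, 2009, 2018, 2018],
    "gcs_mn_all": [-1.0, -0.75, -0.75, -0.5],
})

# Unweighted mean across districts for each year.
yearly = toy.groupby("year")["gcs_mn_all"].mean()

# True when every yearly average sits more than half a grade below the national average.
stays_below = bool((yearly < -0.5).all())
```

With matplotlib installed, calling `yearly.plot()` would draw the corresponding line chart.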