tl;dr:
 We released PySAL 2.2 into the wild with a brand-new (but backwards-compatible) structure, and it’s pretty great.
One [meta]package to Rule Them All
A year ago, we released PySAL 2.0, which was a major refactor for the library that broke it apart
from a single monolithic package into several smaller, more focused subpackages. That decision gave
us some much-needed flexibility and allowed us to focus development on new features without other
parts of the library slowing down new releases. After the 2.0 release, you could download any of
PySAL’s subpackages independently, but you could also grab the entire library with
conda install pysal. From the user’s perspective, though, this created some confusion, since it
was possible to use PySAL’s functionality from either the subpackages or the large metapackage.
Worse, the documentation was in two places and things could get out of sync, where new features were
available in subpackages but not yet integrated into the metapackage. The road to a larger, more
modular package has a few bumps, but we have some ideas for helping smooth it over.
As of release 2.2, we’ve cleaned that up. Now, the pysal package acts as a convenient container
for installing all the packages in the ecosystem. If you have legacy code that interacts with pysal,
that’s fine! It will still work. But now when you install pysal you are also guaranteed to get all
the subpackages. Instead of duplicating code from across the ecosystem, pysal specifies each of
the subpackages as a dependency. In plain terms, that means that the PySAL metapackage probably
works more like people expect: when you install pysal, it also installs all of pysal’s
subpackages, so you can use them from the monolithic metapackage, or a la carte. That means if you do
import segregation
or
from pysal.explore import segregation
you’re working with the exact same package. If you import the package from pysal, it’s doing nothing more than silently importing the segregation package internally.
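To make the “exact same package” point concrete, here is a minimal sketch of how a metapackage namespace can re-export an independently installed package. It doesn’t touch PySAL itself: segregation_demo and pysal_demo are hypothetical stand-ins built in memory, and the real pysal internals may well be organized differently.

```python
import importlib
import sys
import types

# Hypothetical stand-in for an independently installed subpackage.
standalone = types.ModuleType("segregation_demo")
standalone.__version__ = "0.0.1"
sys.modules["segregation_demo"] = standalone

# Hypothetical stand-in for the metapackage: it holds no analysis code of
# its own and simply re-exports the standalone module under its namespace.
meta = types.ModuleType("pysal_demo")
meta.__path__ = []  # mark as a package so dotted imports resolve
explore = types.ModuleType("pysal_demo.explore")
explore.__path__ = []
explore.segregation_demo = importlib.import_module("segregation_demo")
meta.explore = explore
sys.modules["pysal_demo"] = meta
sys.modules["pysal_demo.explore"] = explore

# Both import paths now hand back the very same module object.
import segregation_demo
from pysal_demo.explore import segregation_demo as via_meta

same_object = segregation_demo is via_meta
print(same_object)
```

Because the metapackage only points at the standalone module, a fix released in the subpackage is visible through both import paths with nothing to keep in sync.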
But What Does PySAL Do?
Last month I was chatting about spatial data science at a conference with some colleagues. Most of
them are R users, so I got a few questions about what was new in pysal-land. One person asked
“… what does pysal actually do?” I wasn’t expecting the question, so I got excited. But the further I got into my answer, the more I felt like Jim Carrey.
PySAL has grown to do so much that it’s difficult to encapsulate in a few words. That means it’s well past time to provide a more structured overview of its functionality, if for no other reason than to give myself more of a roadmap next time I’m in the position to explain it again. If you want the decade-long background, you’re in luck: we just wrote a paper! But a quick overview of the library and what it’s for is also in order, so here goes^{1}.
The short version is: PySAL, the Python Spatial Analysis Library, is a Python package for spatial data science. It supports the development of high-level applications for spatial analysis, such as
 detection of spatial clusters, hotspots, and outliers
 construction of graphs from spatial data
 spatial regression and statistical modeling on geographically embedded networks
 spatial econometrics
 exploratory spatiotemporal data analysis
The longer version is: PySAL is a family of packages (currently 16) divided into four major
components: lib, model, explore, and viz. It started over a decade ago as a
collaboration between Serge Rey and Luc Anselin, consolidating their
work on spatial econometrics and space-time dynamics. In those days, the Python data ecosystem was
nascent (and the spatial data ecosystem was nonexistent). So the early PySAL team started by
building fundamental data structures for spatial analysis and econometrics on top of numpy,
including things like shapefile readers, classes for building spatial weights, functions for
calculating measures like Moran’s I or estimating spatial autoregressive models, and tools for
plotting spatial data.
Fast forward 10 years and the Python data landscape has changed substantially. PySAL has too. With
things like pandas, geopandas, scikit-learn, statsmodels, and a rapidly growing interactive
visualization ecosystem^{2}, we can offload
some of the nuts and bolts to our friends and instead focus on new spatial analytics, like a
new statistical measure of fit for spatially constrained cluster models, or a
computational inference framework for comparative segregation analysis. To accommodate the growing
variety of spatial analytics PySAL supports, it’s provided as a family of packages organized around
an ad hoc structure.
Lib
The lib layer provides tools to solve a wide variety of computational geometry problems, including
graph construction from polygonal lattices, lines, and points, construction and interactive editing
of spatial weights matrices & graphs, computation of alpha shapes, spatial indices, and
spatial-topological relationships, and reading and writing of sparse graph data, as well as pure
Python readers of spatial vector data. Unlike other PySAL layers, these functions are exposed
together as a single package.

 libpysal: provides foundational algorithms and data structures that support the rest of the library. This includes the following modules: input/output (io), which provides readers and writers for common geospatial file formats; weights (weights), which provides the main class to store spatial weights matrices, as well as several utilities to manipulate and operate on them; computational geometry (cg), with several algorithms, such as Voronoi tessellations or alpha shapes, that efficiently process geometric shapes; and example data sets (examples).
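If “spatial weights” sounds abstract, a toy version may help. The sketch below builds rook-contiguity neighbors (cells that share an edge) for a regular grid and row-standardizes them, in plain Python. The function names rook_neighbors and row_standardize are made up for this illustration; libpysal’s own weights classes do a great deal more and work on real geometries.

```python
def rook_neighbors(n_rows, n_cols):
    """Map each cell id of a regular grid to the ids of cells sharing an edge."""
    neighbors = {}
    for r in range(n_rows):
        for c in range(n_cols):
            cell = r * n_cols + c
            adjacent = []
            if r > 0:
                adjacent.append(cell - n_cols)  # cell above
            if c > 0:
                adjacent.append(cell - 1)       # cell to the left
            if c < n_cols - 1:
                adjacent.append(cell + 1)       # cell to the right
            if r < n_rows - 1:
                adjacent.append(cell + n_cols)  # cell below
            neighbors[cell] = adjacent
    return neighbors


def row_standardize(neighbors):
    """Give each neighbor an equal weight so every row of weights sums to 1."""
    return {
        i: {j: 1.0 / len(js) for j in js}
        for i, js in neighbors.items()
        if js
    }


neighbors = rook_neighbors(3, 3)
weights = row_standardize(neighbors)
print(neighbors[4])  # the center cell of a 3x3 grid
print(weights[0])    # a corner cell has two neighbors
```

Storing the weights as a dictionary of neighbor lists keeps the structure sparse, which matters once the grid is replaced by thousands of polygons.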
Explore
The explore layer includes packages for exploratory analysis of spatial and spatiotemporal
data. These packages focus on revealing and interrogating patterns in the data and suggesting new
interesting questions rather than answering existing ones. There are also methods for examining the
dynamics of these distributions, such as how their composition or spatial extent changes over
time.

 esda: exploratory spatial data analysis and inference for global and local spatial autocorrelation
 giddy: space-time analysis of distribution dynamics
 inequality: spatiotemporal inequality analysis
 pointpats: statistical point pattern analysis
 segregation: single-value and comparative segregation measurement, decomposition, and inference
 spaghetti: spatial analysis of graphs, networks, topology, and inference
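For a flavor of what “global spatial autocorrelation” means, here is a rough, dependency-free sketch of Moran’s I with binary neighbor weights. It is only the textbook formula, not esda’s implementation, and it skips the inference step entirely; the morans_i function and the four-cell chain are invented for the example.

```python
def morans_i(values, neighbors):
    """Global Moran's I with binary weights given as {cell: [neighbor ids]}."""
    n = len(values)
    mean = sum(values) / n
    z = [v - mean for v in values]  # deviations from the mean
    # S0 is the sum of all weights; with binary weights that is the
    # total number of (directed) neighbor links.
    s0 = sum(len(js) for js in neighbors.values())
    cross = sum(z[i] * z[j] for i, js in neighbors.items() for j in js)
    return (n / s0) * cross / sum(d * d for d in z)


# Four cells in a row: 0 - 1 - 2 - 3
chain = {0: [1], 1: [0, 2], 2: [1, 3], 3: [2]}

smooth = morans_i([1, 2, 3, 4], chain)       # similar values sit together
checkerboard = morans_i([1, 4, 1, 4], chain)  # neighbors always differ
print(smooth, checkerboard)
```

A steadily increasing gradient gives a positive value, while the alternating pattern gives a strongly negative one, which is the intuition behind reading Moran’s I as a measure of clustering versus dispersion.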
Model
The model layer focuses on confirmatory analysis. In particular, its packages focus on the
estimation of spatial relationships in data with a variety of linear, generalized-linear,
generalized-additive, nonlinear, multilevel, and local regression models.

 mgwr: single and multiscale geographically weighted regression modeling
 spglm: sparse matrix generalized linear regression modeling
 spint: gravity-type spatial interaction modeling
 spreg: spatial econometric modeling
 spvcm: Bayesian spatial multilevel modeling
 tobler: areal interpolation and dasymetric mapping
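Areal interpolation is probably the least familiar term in that list, so here is a small sketch of the simplest variant: area-weighted interpolation of a count (an extensive variable) from source zones to target zones. To stay self-contained it uses axis-aligned rectangles written as (xmin, ymin, xmax, ymax) instead of real polygons, and every name in it is hypothetical. tobler works with actual geometries and offers more refined methods.

```python
def rect_area(rect):
    """Area of an axis-aligned rectangle (xmin, ymin, xmax, ymax)."""
    return (rect[2] - rect[0]) * (rect[3] - rect[1])


def overlap_area(a, b):
    """Area shared by two axis-aligned rectangles (0 if they do not overlap)."""
    width = min(a[2], b[2]) - max(a[0], b[0])
    height = min(a[3], b[3]) - max(a[1], b[1])
    return max(width, 0.0) * max(height, 0.0)


def area_weighted_sum(sources, targets):
    """Split each source count across targets in proportion to shared area."""
    estimates = []
    for target in targets:
        total = 0.0
        for rect, count in sources:
            total += count * overlap_area(rect, target) / rect_area(rect)
        estimates.append(total)
    return estimates


# Two source zones with known populations (made-up numbers).
sources = [((0, 0, 2, 2), 100), ((2, 0, 4, 2), 200)]
# Three target zones that cut across the source boundaries.
targets = [(0, 0, 1, 2), (1, 0, 3, 2), (3, 0, 4, 2)]

estimates = area_weighted_sum(sources, targets)
print(estimates)
```

Since the targets tile the same extent as the sources, the estimates add back up to the original total of 300, which is the property that makes this approach suitable for counts.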
Viz
The viz layer provides functionality to support the creation of geovisualisations and visual
representations of outputs from a variety of spatial analyses. Visualization plays a central role in
modern spatial/geographic data science. Current packages provide classification methods for
choropleth mapping and a common API for linking PySAL outputs to visualization toolkits in the
Python ecosystem.

 legendgram: legends that visualize the distribution of observations by color in a given map
 mapclassify: choropleth map classification algorithms
 splot: statistical visualization for spatial analysis
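Choropleth classification just means deciding which color bin each observation falls into. As a hedged illustration, the sketch below implements the simplest scheme, equal intervals, in plain Python. The function names are invented here and this is not mapclassify’s code; the library provides this scheme and many others.

```python
from bisect import bisect_left


def equal_interval_bins(values, k):
    """Upper bounds of k equal-width classes spanning min(values)..max(values)."""
    lo, hi = min(values), max(values)
    width = (hi - lo) / k
    return [lo + width * (i + 1) for i in range(k)]


def classify(values, upper_bounds):
    """Index of the first class whose upper bound is >= each value."""
    last = len(upper_bounds) - 1
    # min(..., last) guards against floating-point round-off on the maximum.
    return [min(bisect_left(upper_bounds, v), last) for v in values]


values = [1, 3, 4, 8, 13, 21]
bins = equal_interval_bins(values, k=4)
classes = classify(values, bins)
print(bins)
print(classes)
```

With skewed data like this, equal intervals put half the observations in the first class, which is exactly why having several classification schemes to compare is useful.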
A Loose Translation
If you’re a spatial person coming from R, a conversion from PySAL to R might look a bit like the following^{3}:
| PySAL | R |
|---|---|
| geopandas^{4} | sp / sf |
| libpysal / esda | spdep |
| pointpats | spatstat |
| inequality | ineq |
| segregation | OasisR^{5} |
| giddy | spMarkov / spMC |
| spaghetti | tidygraph |
| mgwr | spgwr^{6} |
| spglm | glm |
| spint | spatialPosition |
| spreg | spatialreg / splm |
| spvcm | NA |
| tobler | areal |
Spatial Data Science Gestalt
Prior to version 2.2, we combined all these packages into a single codebase distributed as the monolithic PySAL package, since spatial data science workflows necessarily include the use of several packages in tandem. But since each of the subpackages was available on its own too, people often got confused. Now when you conda install pysal, you get them all. Most analyses begin with some exploration, first by generating a few descriptive plots using mapclassify and splot (and the affiliated package contextily), before examining spatial relationships in more detail (using libpysal to create spatial weights and esda to analyze spatial autocorrelation). After an analyst has a more thorough understanding of the data, she might move on to build spatial autoregressive models with spreg or geographically weighted regression models with mgwr, or conduct some regional comparative analyses with inequality or segregation.
Alternatively, a researcher might be interested in spatiotemporal analysis, and begin by collecting some data from the census using the PySAL affiliated package cenpy, before using tobler to convert census geographies into time-static units. With this new dataset in hand, the analyst can proceed to examine how a region evolves through space and time using giddy. These kinds of workflows are common across a vast range of social and natural sciences and we hope to keep building out the PySAL family to make it easier to integrate them alongside the rest of the pydata ecosystem.
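One common way to summarize how places evolve over time is a classic Markov transition matrix: classify each place in each period, then count how often places move between classes. The sketch below is a bare-bones, assumption-laden version of that idea in plain Python, with invented data and an invented transition_matrix function. It is not giddy’s API, which also handles the spatial side of the problem.

```python
def transition_matrix(sequences, k):
    """Estimate k x k transition probabilities from sequences of class ids."""
    counts = [[0] * k for _ in range(k)]
    for seq in sequences:
        # Pair each period's class with the class in the following period.
        for current, following in zip(seq, seq[1:]):
            counts[current][following] += 1
    probabilities = []
    for row in counts:
        total = sum(row)
        probabilities.append([c / total if total else 0.0 for c in row])
    return probabilities


# Each row is one region's class (0 = low, 1 = high) over four periods.
regions = [
    [0, 0, 1, 1],
    [1, 1, 0, 0],
    [0, 1, 1, 1],
]

P = transition_matrix(regions, k=2)
print(P)
```

Each row of the result sums to one and reads as “given a region is in this class now, where is it likely to be next period?”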
For more information on the library, check out the paper linked above or the documentation for each of the individual packages. We’re still pulling together a full suite of tutorials that will be available as the notebooks project. But until then you can check out some of the workshop materials we’ve put together (including the latest) or Serge, Dani, and Levi’s fantastic forthcoming book.

^{1} most of this is lightly repurposed from the paper :)

^{2} see bokeh, panel, hvplot, altair, folium, ipyleaflet, etc.

^{3} I’m not terribly picky on the capitalization or distinction between package/function here. If I got it wrong, my apologies. The goal is to provide a loose translation of functionality, not a 1:1 mapping.

^{4} geopandas isn’t part of pysal, but it’s the central infrastructure for spatial analysis in Python.

^{5} OasisR (which is fantastic) is the closest to matching segregation’s feature set, but PySAL’s segregation also includes a decomposition framework, additional computational inference methods, and street network-based and multiscalar methods. See here for an example.

^{6} as far as I know, unlike PySAL’s gwr, spgwr does not implement multiscalar geographically weighted regression.