geodatasets

Fetch links or download and cache spatial data example files.

The geodatasets package provides an API on top of a JSON file with metadata of externally hosted datasets. The datasets contain geospatial information useful for illustrative and educational purposes.

Install

From PyPI:

pip install geodatasets

or using conda or mamba from conda-forge:

conda install geodatasets -c conda-forge

The development version can be installed from GitHub using pip:

pip install git+https://github.com/geopandas/geodatasets.git

How to use

The package comes with a database of datasets. To see all of them:

In [1]: import geodatasets

In [2]: geodatasets.data
Out[2]:
{'geoda': {'airbnb': {'url': 'https://geodacenter.github.io/data-and-lab//data/airbnb.zip',
   'license': 'CC-0',
   'attribution': 'GeoDa Data and Lab',
   'name': 'geoda.airbnb',
   'description': 'Airbnb rentals, socioeconomics, and crime in Chicago',
   'nrows': 77,
   'ncols': 20,
   'details': 'https://geodacenter.github.io/data-and-lab//airbnb/',
   'hash': 'a2ab1e3f938226d287dd76cde18c00e2d3a260640dd826da7131827d9e76c824',
   'filename': 'airbnb.zip'},
  'atlanta': {'url': 'https://geodacenter.github.io/data-and-lab//data/atlanta_hom.zip',
   'license': 'CC-0',
   'attribution': 'GeoDa Data and Lab',
   'name': 'geoda.atlanta',
   'description': 'Atlanta, GA region homicide counts and rates',
   'nrows': 90,
   'ncols': 23,
   'details': 'https://geodacenter.github.io/data-and-lab//atlanta_old/',
   'hash': 'missing',
   'filename': 'atlanta_hom.zip'},
   ...
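
The printed output suggests that the database behaves like a nested dictionary: a provider key such as 'geoda' maps to dataset keys, and each dataset maps to a record of metadata. The following is a minimal sketch of walking such a structure. It uses a hand-copied slice of the output above (the `sample` dictionary and the `iter_datasets` helper are hypothetical and not part of geodatasets) so that it runs without the package or network access.

```python
# Hypothetical stand-in for a small slice of the metadata shown above;
# only a few of the keys are copied here.
sample = {
    "geoda": {
        "airbnb": {
            "name": "geoda.airbnb",
            "hash": "a2ab1e3f938226d287dd76cde18c00e2d3a260640dd826da7131827d9e76c824",
            "filename": "airbnb.zip",
        },
        "atlanta": {
            "name": "geoda.atlanta",
            "hash": "missing",
            "filename": "atlanta_hom.zip",
        },
    }
}


def iter_datasets(tree):
    """Yield every dataset record (a dict that carries a 'name' key)."""
    for value in tree.values():
        if isinstance(value, dict) and "name" in value:
            yield value
        elif isinstance(value, dict):
            # Provider level: descend into its datasets.
            yield from iter_datasets(value)


# All dataset names, in insertion order.
names = [record["name"] for record in iter_datasets(sample)]

# Records whose hash is recorded as 'missing', as for geoda.atlanta above.
unhashed = [r["name"] for r in iter_datasets(sample) if r["hash"] == "missing"]
```

A walk like this is handy for quick checks, for example finding the entries that have no recorded hash.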

There is also a convenient top-level API. One function returns only the URL:

In [3]: geodatasets.get_url("geoda airbnb")
Out[3]: 'https://geodacenter.github.io/data-and-lab//data/airbnb.zip'

And one to get the local path. If the file is not available in the cache, it will be downloaded first.

In [4]: geodatasets.get_path('geoda airbnb')
Out[4]: '/Users/martin/Library/Caches/geodatasets/airbnb.zip'
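
The download-if-missing behaviour described above can be illustrated with a small sketch. This is not the implementation used by geodatasets. The `cached_path` function and its `fetch` parameter are hypothetical, and treating the 64-character 'hash' value as a SHA-256 digest is an assumption. A fake downloader stands in for the HTTP request so that the example runs offline.

```python
import hashlib
import tempfile
from pathlib import Path


def cached_path(filename, fetch, cache_dir, expected_hash=None):
    """Return the path of `filename` in `cache_dir`, calling `fetch` only on a cache miss.

    `fetch` is a hypothetical callable returning the file content as bytes;
    in a real setting it would perform the HTTP download.
    """
    cache_dir = Path(cache_dir)
    target = cache_dir / filename
    if not target.exists():
        cache_dir.mkdir(parents=True, exist_ok=True)
        content = fetch()
        # Assumption: the recorded hash is a SHA-256 hex digest.
        if expected_hash is not None:
            digest = hashlib.sha256(content).hexdigest()
            if digest != expected_hash:
                raise ValueError(f"hash mismatch for {filename}")
        target.write_bytes(content)
    return str(target)


# Usage with a fake downloader that counts how often it is called.
calls = []


def fake_fetch():
    calls.append(1)
    return b"example bytes"


with tempfile.TemporaryDirectory() as tmp:
    first = cached_path("airbnb.zip", fake_fetch, tmp)
    second = cached_path("airbnb.zip", fake_fetch, tmp)  # served from the cache
    download_count = len(calls)
    same_path = first == second
```

The second call finds the file already in place and returns the same path without invoking the downloader again.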

You can also get all the details:

In [5]: geodatasets.data.geoda.airbnb
Out[5]:
{'url': 'https://geodacenter.github.io/data-and-lab//data/airbnb.zip',
 'license': 'CC-0',
 'attribution': 'GeoDa Data and Lab',
 'name': 'geoda.airbnb',
 'description': 'Airbnb rentals, socioeconomics, and crime in Chicago',
 'nrows': 77,
 'ncols': 20,
 'details': 'https://geodacenter.github.io/data-and-lab//airbnb/',
 'hash': 'a2ab1e3f938226d287dd76cde18c00e2d3a260640dd826da7131827d9e76c824',
 'filename': 'airbnb.zip'}

Or query the database by name:

In [6]: geodatasets.data.query_name('geoda airbnb')
Out[6]:
{'url': 'https://geodacenter.github.io/data-and-lab//data/airbnb.zip',
 'license': 'CC-0',
 'attribution': 'GeoDa Data and Lab',
 'name': 'geoda.airbnb',
 'description': 'Airbnb rentals, socioeconomics, and crime in Chicago',
 'nrows': 77,
 'ncols': 20,
 'details': 'https://geodacenter.github.io/data-and-lab//airbnb/',
 'hash': 'a2ab1e3f938226d287dd76cde18c00e2d3a260640dd826da7131827d9e76c824',
 'filename': 'airbnb.zip'}
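
The examples pass 'geoda airbnb' although the stored name is 'geoda.airbnb', which suggests that the lookup tolerates different separators. One way such matching could work is sketched below. The `normalize` and `query_by_name` helpers are hypothetical illustrations, not the library's code, and the exact normalization rules that geodatasets applies are an assumption here.

```python
def normalize(name):
    """Lower-case a name and drop every non-alphanumeric character."""
    return "".join(ch for ch in name.lower() if ch.isalnum())


def query_by_name(records, name):
    """Return the first record whose normalized 'name' equals the normalized query."""
    wanted = normalize(name)
    for record in records:
        if normalize(record["name"]) == wanted:
            return record
    raise ValueError(f"no dataset matches {name!r}")


# Two records reduced to the fields needed for this example.
records = [
    {"name": "geoda.airbnb", "filename": "airbnb.zip"},
    {"name": "geoda.atlanta", "filename": "atlanta_hom.zip"},
]

# A space instead of a dot still resolves to the same record.
match = query_by_name(records, "geoda airbnb")
```

Comparing normalized forms keeps the stored names canonical while letting callers write the name with a space, a dot, or different capitalization.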