Using regionmask with intake

Regions stored in shapefiles that geopandas can read can be pre-defined in a YAML catalog file, which is easy to share. This relies on intake_geopandas, which accepts regionmask_kwargs and passes them to regionmask.from_geopandas. If you set use_fsspec=True and prefix the URL with simplecache::, the shapefile is cached locally.

You need to install intake_geopandas, which combines geopandas and intake. For the intake documentation, see https://intake.readthedocs.io/en/latest/.

In [1]: import intake_geopandas

In [2]: import intake

# open a pre-defined remote or local catalog yaml file
In [3]: url = 'https://raw.githubusercontent.com/regionmask/regionmask/master/data/regions_remote_catalog.yaml'

In [4]: cat = intake.open_catalog(url)

# access data from remote source
In [5]: meow_regions = cat.MEOW.read()

In [6]: print(meow_regions)
<regionmask.Regions>
Name:     unnamed

Regions:
  0   r0   Region0
  1   r1   Region1
  2   r2   Region2
  3   r3   Region3
  4   r4   Region4
 ..  ...       ...
227 r227 Region227
228 r228 Region228
229 r229 Region229
230 r230 Region230
231 r231 Region231

[232 regions]

In [7]: meow_regions.plot(add_label=False)
Out[7]: <cartopy.mpl.geoaxes.GeoAxesSubplot at 0x7f4d100d0c50>
[Figure: map of the MEOW regions plotted without labels (_images/plotting_MEOW.png)]

More pre-defined regions of this kind are available in remote_climate_data.

Build your own catalog

To create a catalog, use the catalog syntax described in the intake documentation. The example below uses the Marine Ecoregions Of the World (MEOW) data set, a biogeographic classification of the world’s coasts and shelves.

In [8]: with open('regions_my_local_catalog.yml', 'w') as f:
   ...:     f.write("""
   ...: plugins:
   ...:   source:
   ...:     - module: intake_geopandas
   ...: sources:
   ...:   MEOW:
   ...:     description: MEOW for regionmask and cache
   ...:     driver: intake_geopandas.regionmask.RegionmaskSource
   ...:     args:
   ...:       urlpath: simplecache::http://maps.tnc.org/files/shp/MEOW-TNC.zip
   ...:       use_fsspec: true  # optional for caching
   ...:       storage_options:  # optional for caching
   ...:         simplecache:
   ...:           same_names: true
   ...:           cache_storage: cache
   ...:       regionmask_kwargs:
   ...:         names: ECOREGION
   ...:         abbrevs: _from_name
   ...:         source: http://maps.tnc.org
   ...:         numbers: ECO_CODE_X
   ...:         name: MEOW
   ...: """)
   ...: 
In [9]: cat = intake.open_catalog('regions_my_local_catalog.yml')

In [10]: meow_regions = cat.MEOW.read()

In [11]: print(meow_regions)
<regionmask.Regions>
Name:     MEOW
Source:   http://maps.tnc.org

Regions:
  1.0          NorGre                          North Greenland
  2.0    NorandEasIce                   North and East Iceland
  3.0       EasGreShe                     East Greenland Shelf
  4.0       WesGreShe                     West Greenland Shelf
  5.0 NorGraBanSouLab Northern Grand Banks - Southern Labrador
  ...             ...                                      ...
228.0       AmuBelSea              Amundsen/Bellingshausen Sea
229.0          RosSea                                 Ross Sea
230.0    BouandAntIsl             Bounty and Antipodes Islands
231.0          CamIsl                          Campbell Island
232.0          AucIsl                          Auckland Island

[232 regions]

Because the urlpath is prefixed with simplecache:: and use_fsspec is true, the zip file was downloaded to the folder given by cache_storage. With same_names set to true, the cached copy keeps its original file name. File access is now local.

In [12]: import os

In [13]: assert os.path.exists('cache/MEOW-TNC.zip')
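
To see everything the cache folder holds, rather than asserting on a single path, a small helper can list its contents. This is a sketch using only the standard library; the helper name list_cached_files is hypothetical, and the default folder name cache is the cache_storage value from the catalog above.

```python
from pathlib import Path


def list_cached_files(cache_dir="cache"):
    """Return sorted (file name, size in bytes) pairs for files in cache_dir.

    Hypothetical helper for illustration; returns an empty list if the
    folder does not exist yet (for example before the first read).
    """
    root = Path(cache_dir)
    if not root.is_dir():
        return []
    return sorted((p.name, p.stat().st_size) for p in root.iterdir() if p.is_file())


# Print one line per cached file in the folder used by the catalog.
for name, size in list_cached_files("cache"):
    print(f"{name}: {size} bytes")
```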