_images/logo.png

create masks of geographical regions

regionmask is a Python module that:

Usage

Please have a look at the tutorials.

Contents

What’s New

v0.3.1 (4 October 2016)

This is a bugfix/ cleanup release.

Bug Fixes
  • travis was configured wrong - it always tested on python 2.7, thus some python3 issues went unnoticed (:issue:`14`).
  • natural_earth was not properly imported (:issue:`10`).
  • A numpy scalar of dtype integer is not int - i.e. isinstance(np.int32, int) is False (:issue:`11`).
  • In python 3 zip is an iterator (and not a list), thus it failed on mask (:issue:`15`).
  • Removed unnecessary files (ne_downloader.py and naturalearth.py).
  • Resolved conflicting region outlines in the Giorgi regions (:issue:`17`).

v0.3.0 (20 September 2016)

  • Allow passing 2 dimensional latitude and longitude grids (:issue:`8`).

v0.2.0 (5 September 2016)

  • Add name for xarray mask (:issue:`3`). By Mathias Hauser.
  • overhaul of the documentation
  • move rtd / matplotlib handling to background

v0.1.0 (15 August 2016)

  • first release on pypi

Installation

Required dependencies

For plotting on geographical maps:

To open Natural Earth datasets (shapefiles):

Optional dependencies

To output xarray data sets:

Instructions

regionmask itself is a pure Python package, but its dependencies are not. The easiest way to get them installed is to use conda.

conda install numpy cartopy xarray

If you don’t use conda, be sure you have the required dependencies. You can then install regionmask via pip (it is not (yet) available on conda):

pip install regionmask

To run the test suite after installing xarray, install py.test and run py.test regionmask.

To install the development version (master), do:

pip install git+https://github.com/mathause/regionmask

Countries/ States (NaturalEarth)

The outline of the countries are obtained from Natural Earth. They are automatically downloaded on-the-fly (but only once) with cartopy and opened with geopandas. The following countries and regions are defined in regionmask.

  • Countries 1:110m
  • Countries 1:50m
  • US States 1:50m
  • US States 1:10m

Note

A mask obtained with a fine resolution dataset is not necessarily better. Always check your mask!

The following imports are necessary for the examples.

In [1]: import regionmask

In [2]: import matplotlib.pyplot as plt

Countries

In [3]: regionmask.defined_regions.natural_earth.countries_110.plot(add_label=False);

In [4]: plt.tight_layout()
_images/plotting_countries.png

US States

In [5]: states = regionmask.defined_regions.natural_earth.us_states_50

In [6]: states.plot(add_label=False);

In [7]: plt.tight_layout()
_images/plotting_states.png

Also create a mask for a 1° grid over the US:

In [8]: import numpy as np

# create a grid
In [9]: lon = np.arange(200.5, 325)

In [10]: lat = np.arange(74.5, 14, -1)

In [11]: mask = states.mask(lon, lat, wrap_lon=True)

In [12]: mask_ma = np.ma.masked_invalid(mask)

In [13]: states.plot(add_label=False);

In [14]: LON_EDGE = np.arange(200, 326)

In [15]: LAT_EDGE = np.arange(75, 13, -1)

In [16]: plt.pcolormesh(LON_EDGE, LAT_EDGE, mask_ma, cmap='viridis');

In [17]: plt.tight_layout()
_images/plotting_states_mask.png

‘Scientific’ Regions

The following regions, used in the scientific literature, are already defined:

  • Giorgi Regions (from Giorgi and Franciso, 2000)
  • SREX Regions (Special Report on Managing the Risks of Extreme Events and Disasters to Advance Climate Change Adaptation (SREX) from Seneviratne et al., 2012)

The following imports are necessary for the examples.

In [1]: import regionmask

In [2]: import matplotlib.pyplot as plt

Giorgi Regions

In [3]: regionmask.defined_regions.giorgi.plot(label='abbrev');

In [4]: plt.tight_layout()
_images/plotting_giorgi.png

SREX Regions

In [5]: regionmask.defined_regions.srex.plot();

In [6]: plt.tight_layout()
_images/plotting_srex.png

Note

This tutorial was generated from an IPython notebook that can be downloaded here.

Plotting

Every region has a plotting function, that draws the outline of all (or selected) regions on a cartopy map. We use the predefined SREX regions as example.

Import regionmask and check the version:

import regionmask
regionmask.__version__
'0.3.1'

Plot all regions

Do the default plot.

regionmask.defined_regions.srex.plot();
_images/plotting_5_0.png

Plot only a Subset of Regions

# load cartopy
import cartopy.crs as ccrs

# regions can be selected by number, abbreviation or long name
regions=[11, 'CEU', 'S. Europe/Mediterranean']

# choose a good projection for regional maps
proj=ccrs.LambertConformal(central_longitude=15)

# do the plot
ax = regionmask.defined_regions.srex.plot(regions=regions, add_ocean=False, resolution='50m',
                                          proj=proj, label='abbrev')

# fine tune the extent
ax.set_extent([-15, 45, 28, 75], crs=ccrs.PlateCarree())
_images/plotting_7_0.png

Note

This tutorial was generated from an IPython notebook that can be downloaded here.

Create numpy region mask

In this tutorial we will show how to create a mask for arbitrary latitude and longitude grids.

Import regionmask and check the version:

import regionmask
regionmask.__version__
'0.3.1'

We define a lon/ lat grid with a 1° grid spacing, where the points define the middle of the grid. Additionally we create a grid that spans the edges of the grid for the plotting.

import numpy as np

# define a longitude latitude grid
lon = np.arange(-179.5, 180)
lat = np.arange(-89.5, 90)

# for the plotting
lon_edges = np.arange(-180, 181)
lat_edges = np.arange(-90, 91)

Again we use the SREX regions. Using xarray=False tells the code to output to a numpy array.

mask = regionmask.defined_regions.srex.mask(lon, lat, xarray=False)

mask is now a n_lon x n_lat numpy array. Gridpoints that do not fall in a region are NaN, the gridpoints that fall in a region are encoded with the number of the region (here 1 to 26).

The function mask determines if all cominations of points given in lon and lat lies within the polygon making up the region.

We can now plot the mask:

import cartopy.crs as ccrs
import matplotlib.pyplot as plt

ax = plt.subplot(111, projection=ccrs.PlateCarree())
# pcolormesh does not handle NaNs, requires masked array
mask_ma = np.ma.masked_invalid(mask)

h = ax.pcolormesh(lon_edges, lat_edges, mask_ma, transform=ccrs.PlateCarree(), cmap='viridis')

ax.coastlines()

plt.colorbar(h, orientation='horizontal', pad=0.04);
_images/mask_numpy_9_0.png

Finally the mask can now be used to mask out all data that is not in a specific region.

# create random data
data = np.random.randn(*lat.shape + lon.shape)

# only retain data in the Central Europe
data_ceu = np.ma.masked_where(mask != 12, data)

Plot the selected data

# load cartopy
import cartopy.crs as ccrs

# choose a good projection for regional maps
proj=ccrs.LambertConformal(central_longitude=15)

# plot the outline of the central European region
ax = regionmask.defined_regions.srex.plot(regions=12, add_ocean=False, resolution='50m',
                          proj=proj, add_label=False)

ax.pcolormesh(lon_edges, lat_edges, data_ceu, transform=ccrs.PlateCarree())

# fine tune the extent
ax.set_extent([-15, 45, 40, 65], crs=ccrs.PlateCarree())
_images/mask_numpy_13_0.png

Finally we can obtain the region mean:

print('Global mean:   ', np.mean(data))
print('Central Europe:', np.mean(data_ceu))
Global mean:    0.00455457297046
Central Europe: -0.0032847776509

Create a mask with a different lon/ lat grid

The interesting thing of gridmask is that you can use any lon/ lat grid.

Use a 5° x 5° grid:

# define a longitude latitude grid
lon5 = np.arange(-177.5, 180, 5)
lat5 = np.arange(-87.5, 90, 5)

# for the plotting
lon5_edges = np.arange(-180, 181, 5)
lat5_edges = np.arange(-90, 91, 5)

mask5_deg = regionmask.defined_regions.srex.mask(lon5, lat5, xarray=False)
ax = plt.subplot(111, projection=ccrs.PlateCarree())
# pcolormesh does not handle NaNs, requires masked array
mask5_ma = np.ma.masked_invalid(mask5_deg)

h = ax.pcolormesh(lon5_edges, lat5_edges, mask5_ma, transform=ccrs.PlateCarree(), cmap='viridis')

ax.coastlines()

plt.colorbar(h, orientation='horizontal', pad=0.04);
_images/mask_numpy_19_0.png

Now the grid cells are much larger.

Note

This tutorial was generated from an IPython notebook that can be downloaded here.

Create xarray region mask

In this tutorial we will show how to create a mask for arbitrary latitude and longitude grids using xarray. It is very similar to the tutorial Create Mask (numpy).

Import regionmask and check the version:

import regionmask
regionmask.__version__
'0.3.1'

Load xarray and the tutorial data:

import xarray as xr
import numpy as np
airtemps = xr.tutorial.load_dataset('air_temperature')

The example data is a temperature field over North America. Let’s plot the first time step:

# load plotting libraries
import matplotlib.pyplot as plt
import cartopy.crs as ccrs

# choose a good projection for regional maps
proj=ccrs.LambertConformal(central_longitude=-100)

ax = plt.subplot(111, projection=proj)

airtemps.isel(time=1).air.plot.pcolormesh(ax=ax, transform=ccrs.PlateCarree())

ax.coastlines();
_images/mask_xarray_8_0.png

Conviniently we can directly pass an xarray object to the mask function. It gets the longitude and latitude from the DataArray/ Dataset and creates the mask. If the longituda and latitude in the xarray object are not called lon and lat, respectively; their name can be given via the lon_name and lat_name keyword. Here we use the Giorgi regions.

mask = regionmask.defined_regions.giorgi.mask(airtemps)
print('All NaN? ',np.all(np.isnan(mask)))
All elements of mask are NaN. Try to set 'wrap_lon=True'.
All NaN?  True

This didn’t work - all elements are NaNs! The reason is that airtemps has its longitude from 0 to 360 while the Giorgi regions are defined as -180 to 180. Thus we can provide the wrap_lon keyword:

mask = regionmask.defined_regions.giorgi.mask(airtemps, wrap_lon=True)
print('All NaN? ',np.all(np.isnan(mask)))
All NaN?  False

This is better. Let’s plot the regions:

proj=ccrs.LambertConformal(central_longitude=-100)
ax = plt.subplot(111, projection=proj)

low = mask.min()
high = mask.max()

levels = np.arange(low - 0.5, high + 1)

mask.plot.pcolormesh(ax=ax, transform=ccrs.PlateCarree(), levels=levels, add_colorbar=False)

ax.coastlines()

# fine tune the extent
ax.set_extent([200, 330, 10, 75], crs=ccrs.PlateCarree());
_images/mask_xarray_14_0.png

We want to select the region ‘Central North America’. Thus we first need to find out which number this is:

regionmask.defined_regions.giorgi.map_keys('Central North America')
6

Select using where

xarray provides the handy where function:

airtemps_CNA = airtemps.where(mask == 6)

Check everything went well by repeating the first plot with the selected region:

# choose a good projection for regional maps
proj=ccrs.LambertConformal(central_longitude=-100)

ax = plt.subplot(111, projection=proj)

airtemps_CNA.isel(time=1).air.plot.pcolormesh(ax=ax, transform=ccrs.PlateCarree())

ax.coastlines();
_images/mask_xarray_21_0.png

Looks good - let’s take the area average and plot the time series. (Note: you should use cos(lat) weights to correctly calculate an area average. Unfortunately this is not yet (as of version 0.8) implemented in xarray.)

ts_airtemps_CNA = airtemps_CNA.mean(dim=('lat', 'lon')) - 273.15
ts_airtemps = airtemps.mean(dim=('lat', 'lon')) - 273.15

# and the line plot
ts_airtemps_CNA.air.plot.line(label='Central North America')
ts_airtemps.air.plot(label='Entire Domain')

plt.legend(ncol=2);
_images/mask_xarray_23_0.png

Select using groupby

# xarray version > 0.8 is required
xr.__version__
'0.8.2'
# mask needs a name
mask.name = 'region'
# you can group over all integer values of the mask
# you have to take the mean over `stacked_lat_lon`
airtemps_all = airtemps.groupby(mask).mean('stacked_lat_lon')
airtemps_all
<xarray.Dataset>
Dimensions:  (region: 6, time: 2920)
Coordinates:
  * time     (time) datetime64[ns] 2013-01-01 2013-01-01T06:00:00 ...
  * region   (region) float64 4.0 5.0 6.0 7.0 8.0 9.0
Data variables:
    air      (region, time) float64 293.6 292.2 291.4 293.6 293.2 291.1 ...

we can add the abbreviations and names of the regions to the DataArray

# extract the abbreviations and the names of the regions from regionmask
abbrevs = regionmask.defined_regions.giorgi[airtemps_all.region.values].abbrevs
names = regionmask.defined_regions.giorgi[airtemps_all.region.values].names

airtemps_all.coords['abbrevs'] = ('region', abbrevs)
airtemps_all.coords['names'] = ('region', names)
airtemps_all
<xarray.Dataset>
Dimensions:  (region: 6, time: 2920)
Coordinates:
  * time     (time) datetime64[ns] 2013-01-01 2013-01-01T06:00:00 ...
  * region   (region) float64 4.0 5.0 6.0 7.0 8.0 9.0
    abbrevs  (region) |S3 'CAM' 'WNA' 'CNA' 'ENA' 'ALA' 'GRL'
    names    (region) |S21 'Central America' 'Western North America' ...
Data variables:
    air      (region, time) float64 293.6 292.2 291.4 293.6 293.2 291.1 ...

now we can select the regions in many ways

f, axes = plt.subplots(3, 1)

# as before, by the index of the region
ax = axes[0]
airtemps_all.sel(region=6).air.plot(ax=ax)

# with the abbreviation
ax = axes[1]
airtemps_all.isel(region=(airtemps_all.abbrevs == 'WNA')).air.plot(ax=ax)

# with the long name
ax = axes[2]
airtemps_all.isel(region=(airtemps_all.names == 'Eastern North America')).air.plot(ax=ax)

f.tight_layout()
_images/mask_xarray_31_0.png

Note

This tutorial was generated from an IPython notebook that can be downloaded here.

Multidimensional coordinates

Multidimensional coordinates with xarray

Regionmask can also handle mutltidimensional longitude/ latitude grids (e.g. from a regional climate model). As xarray provides such an example dataset, we will use it to illustrate it. See also in the xarray documentation.

Import regionmask and check the version:

import regionmask
regionmask.__version__
'0.3.1'

Load xarray and the tutorial data:

import xarray as xr
import numpy as np
rasm = xr.tutorial.load_dataset('rasm')

The example data is a temperature field over the Northern Hemisphere. Let’s plot the first time step:

# load plotting libraries
import matplotlib.pyplot as plt
import cartopy.crs as ccrs

# choose a projection
proj=ccrs.NorthPolarStereo()

ax = plt.subplot(111, projection=proj)
ax.set_global()

rasm.isel(time=1).Tair.plot.pcolormesh(ax=ax, x='xc', y='yc', transform=ccrs.PlateCarree())

# add the abbreviation of the regions
regionmask.defined_regions.srex.plot(ax=ax, regions=[1, 2, 11, 12, 18],
                                     add_ocean=False, coastlines=False, label='abbrev')

ax.set_extent([-180, 180, 50, 90], ccrs.PlateCarree())

ax.coastlines();
_images/mask_multidim_9_0.png

Again we pass the xarray object to regionmask. We have to specify xc and yc as the longitude and latitude coordinates of the array:

mask = regionmask.defined_regions.srex.mask(rasm, lon_name='xc', lat_name='yc', wrap_lon=True)
mask
<xarray.DataArray (y: 205, x: 275)>
array([[ nan,  nan,  nan, ...,   5.,   5.,   5.],
       [ nan,  nan,  nan, ...,   5.,   5.,   5.],
       [ nan,  nan,  nan, ...,   5.,   5.,   5.],
       ...,
       [ 24.,  24.,  24., ...,  14.,  14.,  14.],
       [ 24.,  24.,  24., ...,  14.,  14.,  14.],
       [ 24.,  24.,  24., ...,  14.,  14.,  14.]])
Coordinates:
  * y        (y) int64 0 1 2 3 4 5 6 7 8 9 10 11 12 13 14 15 16 17 18 19 20 ...
  * x        (x) int64 0 1 2 3 4 5 6 7 8 9 10 11 12 13 14 15 16 17 18 19 20 ...
    yc       (y, x) float64 16.53 16.78 17.02 17.27 17.51 17.76 18.0 18.25 ...
    xc       (y, x) float64 189.2 189.4 189.6 189.7 189.9 190.1 190.2 190.4 ...

We want to select the region ‘NAS’ (Northern Asia).

Select using where

We have to select by index (the number of the region), we thus map from the abbreviation to the index.

rasm_NAS = rasm.where(mask == regionmask.defined_regions.srex.map_keys('NAS'))

Check everything went well by repeating the first plot with the selected region:

# choose a projection
proj=ccrs.NorthPolarStereo()

ax = plt.subplot(111, projection=proj)
ax.set_global()

rasm_NAS.isel(time=1).Tair.plot.pcolormesh(ax=ax, x='xc', y='yc', transform=ccrs.PlateCarree())


# add the abbreviation of the regions
regionmask.defined_regions.srex.plot(ax=ax, regions=['NAS'],
                                     add_ocean=False, coastlines=False, label='abbrev')

ax.set_extent([-180, 180, 50, 90], ccrs.PlateCarree())

ax.coastlines();
_images/mask_multidim_17_0.png

Looks good - now we coould again take the area average.

Multidimensional coordinates with numpy

If you don’t use xarray, the same can be achieved with numpy arrays.

# extract lon and lat
lon = rasm.xc.values
lat = rasm.yc.values

The lon and lat numpy arrays can directly be passed to regionmask (now we create a mask with the countries of the world):

mask = regionmask.defined_regions.natural_earth.countries_110.mask(lon, lat, wrap_lon=True)
mask
<xarray.DataArray (lon_idx: 205, lat_idx: 275)>
array([[ nan,  nan,  nan, ...,  nan,  nan,  nan],
       [ nan,  nan,  nan, ...,  nan,  nan,  nan],
       [ nan,  nan,  nan, ...,  nan,  nan,  nan],
       ...,
       [ nan,  nan,  nan, ...,  93.,  93.,  93.],
       [ nan,  nan,  nan, ...,  93.,  93.,  93.],
       [ nan,  nan,  nan, ...,  93.,  93.,  93.]])
Coordinates:
  * lat_idx  (lat_idx) int64 0 1 2 3 4 5 6 7 8 9 10 11 12 13 14 15 16 17 18 ...
    lat      (lon_idx, lat_idx) float64 16.53 16.78 17.02 17.27 17.51 17.76 ...
  * lon_idx  (lon_idx) int64 0 1 2 3 4 5 6 7 8 9 10 11 12 13 14 15 16 17 18 ...
    lon      (lon_idx, lat_idx) float64 189.2 189.4 189.6 189.7 189.9 190.1 ...

If xarray is installed, regionmask automatically returns a xarray object even when passing numpy arrays. To get a numpy array, use the xarray=False keyword.

# mask = regionmask.defined_regions.natural_earth.countries_110.mask(lon, lat, xarray=False, wrap_lon=True)

Or just use mask.values:

mask = mask.values
mask
array([[ nan,  nan,  nan, ...,  nan,  nan,  nan],
       [ nan,  nan,  nan, ...,  nan,  nan,  nan],
       [ nan,  nan,  nan, ...,  nan,  nan,  nan],
       ...,
       [ nan,  nan,  nan, ...,  93.,  93.,  93.],
       [ nan,  nan,  nan, ...,  93.,  93.,  93.],
       [ nan,  nan,  nan, ...,  93.,  93.,  93.]])

Finally, plot the mask:

proj=ccrs.NorthPolarStereo()

ax = plt.subplot(111, projection=proj)

m = np.ma.masked_invalid(mask)
ax.pcolormesh(lon, lat, m, transform=ccrs.PlateCarree())

ax.set_extent([-180, 180, 45, 90], ccrs.PlateCarree())

ax.coastlines();
_images/mask_multidim_29_0.png

Note

This tutorial was generated from an IPython notebook that can be downloaded here.

Create your own region

Creating own regions is straightforward.

Import regionmask and check the version:

import regionmask
regionmask.__version__
'0.3.1'

Assume you have two custom regions in the US.

US1 = [[-100., 30], [-100, 40], [-120, 35]]
US2 = [[-100., 30], [-80, 30], [-80, 40], [-100, 40]]

You also need to provide numbers, names and abbreviations:

numbers = [0, 1]
names = ['US_west', 'US_east']
abbrevs = ['USw', 'USe']

USmask = regionmask.Regions_cls('USmask', numbers, names, abbrevs, [US1, US2])

Again we can plot the outline of the defined regions

ax = USmask.plot() #(label='abbrev')

# load cartopy
import cartopy.crs as ccrs
# fine tune the extent
ax.set_extent([225, 300, 25, 45], crs=ccrs.PlateCarree());
_images/create_own_regions_9_0.png

and obtain a mask:

import numpy as np

# define lat/ lon grid
lon = np.arange(200.5, 330)
lat = np.arange(74.5, 15, -1)

# for the plotting
lon_edges = np.arange(200, 330)
lat_edges = np.arange(74, 14, -1)

mask = USmask.mask(lon, lat, wrap_lon=True)
import matplotlib.pyplot as plt

ax = plt.subplot(111, projection=ccrs.PlateCarree())
# pcolormesh does not handle NaNs, requires masked array
mask_ma = np.ma.masked_invalid(mask)

h = ax.pcolormesh(lon_edges, lat_edges, mask_ma, transform=ccrs.PlateCarree(), cmap='viridis')

ax.coastlines()

plt.colorbar(h, orientation='horizontal', pad=0.04);

ax.set_extent([225, 300, 25, 45], crs=ccrs.PlateCarree());
_images/create_own_regions_12_0.png

Use shapely Polygon

If you have the region defined as a shapely polygon, this also works:

from shapely.geometry import Polygon

US1_poly = Polygon(US1)
US2_poly = Polygon(US2)
USmask_poly = regionmask.Regions_cls('USmask', numbers, names, abbrevs, [US1_poly, US2_poly])

API reference

This page provides an auto-generated summary of regionmask’s API.

Top-level functions

create_mask_contains(lon, lat, coords[, ...]) create the mask of a list of regions, given the lat and lon coords

Regions_cls

Creating Regions
Regions_cls(name, numbers, names, abbrevs, ...) class for plotting regions and creating region masks
Mapping Number/ Abbreviation/ Name
Regions_cls.map_keys(key) map from names and abbrevs of the regions to numbers
Selecting
Regions_cls.__getitem__(key) subset of Regions or Region
Plotting
Regions_cls.plot([ax, proj, regions, ...]) plot map with with srex regions
Creating a Mask
Regions_cls.mask(lon_or_obj[, lat, ...]) create a grid as mask of a set of regions for given lat/ lon grid
Attributes
Regions_cls.abbrevs list of abbreviations
Regions_cls.names list of long names
Regions_cls.numbers list of the numbers of the regions
Regions_cls.region_ids dict mapping all names and abbrevs to the region number
Regions_cls.coords list of coordinates of the region vertices as numpy array
Regions_cls.polygons list of shapely Polygon/ MultiPolygon of the regions
Regions_cls.centroids list of the center of mass of the regions
Regions_cls._is_polygon is there at least one region that was a Polygon/ MultiPolygon

Region_cls

Creating one Region
Region_cls(number, name, abbrev, outline[, ...]) a single Region, used as member of ‘Regions_cls’
Attributes
Region_cls.coords numpy array of the region
Region_cls.polygon shapely Polygon or MultiPolygon of the region

Private Functions

_wrapAngle360(lon) wrap angle to [0, 360[.
_wrapAngle180(lon) wrap angle to [-180,180[.
_wrapAngle(lon[, wrap_lon]) wrap the angle to the other base

License

regionmask is published under a MIT license.

Indices and tables