Note

This tutorial was generated from an IPython notebook that can be downloaded here.

Edge behavior and interiors

This notebook illustrates the edge behavior and how Polygon interiors are treated.

Note

From version 0.5 regionmask treats points on the region borders differently and also considers poygon interiors (holes), e.g. the Caspian Sea in natural_earth.land_110 region.

Preparation

Import regionmask and check the version:

import regionmask
regionmask.__version__
'0.5.0'

Other imports

import xarray as xr
import numpy as np

import cartopy.crs as ccrs
import matplotlib.pyplot as plt

from matplotlib import colors as mplc
from shapely.geometry import Polygon

Define some colors:

cmap1 = mplc.ListedColormap(['#9ecae1'])
cmap2 = mplc.ListedColormap(['#fc9272'])
cmap3 = mplc.ListedColormap(['#cab2d6'])

Methods

Regionmask offers three methods to rasterize regions

  1. rasterize: fastest but only for equally-spaced grid, uses rasterio.features.rasterize internally.

  2. shapely: for irregular grids, uses shapely.vectorized.contains internally.

  3. legacy: old method (deprecated), slowest and with inconsistent edge behaviour

All methods use the lon and lat coordinates to determine if a grid cell is in a region. lon and lat are assumed to indicate the center of the grid cell. Methods (1) and (2) have the same edge behavior and consider ‘holes’ in the regions. Method (3) is deprecated and will be removed in a future version. regionmask automatically determines which method to use.

  1. subtracts a tiny offset from lon and lat to achieve a edge behaviour consistent with (1). Due to mapbox/rasterio/#1844 this is unfortunately also necessary for (1).

Edge behavior

As of version 0.5 regionmask has a new edge behavior - points that fall of the outline of a region are now consistently treated. This was not the case in earlier versions (xref matplotlib/matplotlib#9704). It’s easiest to see the edge behaviour in an

Example

Define a region and a lon/ lat grid, such that some gridpoints lie exactly on the border:

outline = np.array([[ -80., 50.],
                    [ -80., 28.],
                    [-100., 28.],
                    [-100., 50.]])

region = regionmask.Regions([outline])
ds_US = regionmask.core.utils.create_lon_lat_dataarray_from_bounds(
    *(-161, -29, 2),
    *(75, 13, -2)
)

print(ds_US)
<xarray.Dataset>
Dimensions:   (lat: 30, lat_bnds: 31, lon: 65, lon_bnds: 66)
Coordinates:
  * lon       (lon) float64 -160.0 -158.0 -156.0 -154.0 ... -36.0 -34.0 -32.0
  * lat       (lat) float64 74.0 72.0 70.0 68.0 66.0 ... 22.0 20.0 18.0 16.0
  * lon_bnds  (lon_bnds) int64 -161 -159 -157 -155 -153 ... -39 -37 -35 -33 -31
  * lat_bnds  (lat_bnds) int64 75 73 71 69 67 65 63 61 ... 27 25 23 21 19 17 15
    LON       (lat, lon) float64 -160.0 -158.0 -156.0 ... -36.0 -34.0 -32.0
    LAT       (lat, lon) float64 74.0 74.0 74.0 74.0 ... 16.0 16.0 16.0 16.0
Data variables:
    empty

Let’s create a mask with each of these methods:

mask_rasterize = region.mask(ds_US, method="rasterize")
mask_shapely = region.mask(ds_US, method="shapely")
mask_legacy = region.mask(ds_US, method="legacy")

Note

regionmask automatically detects which method to use, so there is no need to specify the method keyword.

Plot the masked regions:

f, axes = plt.subplots(1, 3, subplot_kw=dict(projection=ccrs.PlateCarree()))

opt = dict(add_colorbar=False, ec="0.5", lw=0.5)

mask_rasterize.plot(ax=axes[0], cmap=cmap1, **opt)
mask_shapely.plot(ax=axes[1], cmap=cmap2, **opt)
mask_legacy.plot(ax=axes[2], cmap=cmap3, **opt)


for ax in axes:
    ax = region.plot_regions(ax=ax, add_label=False)
    ax.set_extent([-105, -75, 25, 55], ccrs.PlateCarree())
    ax.coastlines(lw=0.5)

    ax.plot(ds_US.LON, ds_US.lat, "*", color="0.5", ms=0.5, transform=ccrs.PlateCarree())

axes[0].set_title('rasterize')
axes[1].set_title('shapely')
axes[2].set_title('legacy');
../_images/method_17_0.png

Points indicate the grid cell centers (lon and lat), lines the grid cell borders, colored grid cells are selected to be part of the region. The top and right grid cells now belong to the region while the left and bottom grid cells do not. This choice is arbitrary but mimicks what rasterio.features.rasterize does. This can avoid spurios columns of unassigned grid poins as the following example shows.

SREX regions

Create a global dataset:

ds_GLOB = regionmask.core.utils.create_lon_lat_dataarray_from_bounds(
    *(-180, 181, 2),
    *(90, -91, -2)
)
srex = regionmask.defined_regions.srex

srex_new = srex.mask(ds_GLOB)
srex_old = srex.mask(ds_GLOB, method="legacy")
f, axes = plt.subplots(1, 2, subplot_kw=dict(projection=ccrs.PlateCarree()))

opt = dict(add_colorbar=False, cmap="viridis_r")

srex_new.plot(ax=axes[0], **opt)
srex_old.plot(ax=axes[1], **opt)

for ax in axes:
    srex.plot_regions(ax=ax, add_label=False, line_kws=dict(lw=0.5))
    ax.set_extent([-150, -50, 15, 75], ccrs.PlateCarree())
    ax.coastlines(resolution="50m", lw=0.25)

axes[0].set_title('new (rasterize + shapely)')
axes[1].set_title('legacy');
../_images/method_21_0.png

Polygon interiors

Polygons can have interior boundaries (‘holes’). Previously these were not considered and e.g. the Caspian Sea was not ‘unmasked’.

Example

Let’s test this on an example and define a region_with_hole:

interior = np.array([[ -86., 44.],
                     [ -86., 34.],
                     [ -94., 34.],
                     [ -94., 44.],
                     [ -86., 44.],
                    ])

poly = Polygon(outline, [interior])

region_with_hole = regionmask.Regions([poly])
mask_hole_rasterize = region_with_hole.mask(ds_US, method="rasterize")
mask_hole_shapely = region_with_hole.mask(ds_US, method="shapely")
mask_hole_legacy = region_with_hole.mask(ds_US, method="legacy")
f, axes = plt.subplots(1, 3, subplot_kw=dict(projection=ccrs.PlateCarree()))

opt = dict(add_colorbar=False, ec="0.5", lw=0.5)

mask_hole_rasterize.plot(ax=axes[0], cmap=cmap1, **opt)
mask_hole_shapely.plot(ax=axes[1], cmap=cmap2, **opt)
mask_hole_legacy.plot(ax=axes[2], cmap=cmap3, **opt)

for ax in axes:
    region.plot_regions(ax=ax, add_label=False)

    # interiors are not (yet) ploted by default
    ax.plot(*interior.T, color="k")

    ax.set_extent([-105, -75, 25, 55], ccrs.PlateCarree())
    ax.coastlines(lw=0.5)

    ax.plot(ds_US.LON, ds_US.lat, ".", color="0.5", ms=0.5, transform=ccrs.PlateCarree())

axes[0].set_title('rasterize')
axes[1].set_title('shapely')
axes[2].set_title('legacy');
../_images/method_25_0.png

Caspian Sea

land110 = regionmask.defined_regions.natural_earth.land_110

land_new = land110.mask(ds_GLOB)
land_old = land110.mask(ds_GLOB, method="legacy")
f, axes = plt.subplots(1, 2, subplot_kw=dict(projection=ccrs.PlateCarree()))

opt = dict(add_colorbar=False)

land_new.plot(ax=axes[0], cmap=cmap2, **opt)
land_old.plot(ax=axes[1], cmap=cmap3, **opt)

for ax in axes:

    ax.set_extent([15, 75, 15, 55], ccrs.PlateCarree())
    ax.coastlines(resolution="50m", lw=0.5)

    ax.plot(ds_GLOB.LON, ds_GLOB.lat, ".", color="0.5", ms=0.5, transform=ccrs.PlateCarree())

axes[0].set_title('new (rasterize + shapely)')
axes[1].set_title('legacy');
../_images/method_28_0.png

Speedup

The new methods are faster than the old one:

print("Method: rasterize")
%timeit -n 10 region.mask(ds_US, method="rasterize")
print("Method: shapely")
%timeit -n 10 region.mask(ds_US, method="shapely")
print("Method: legacy")
%timeit -n 10 region.mask(ds_US, method="legacy")
Method: rasterize
3.27 ms ± 1.11 ms per loop (mean ± std. dev. of 7 runs, 10 loops each)
Method: shapely
2.22 ms ± 486 µs per loop (mean ± std. dev. of 7 runs, 10 loops each)
Method: legacy
3.19 ms ± 1.09 ms per loop (mean ± std. dev. of 7 runs, 10 loops each)

While there is not a big difference for this simple example, the difference gets larger for more complex geometries and more gridpoints:

ds_GLOB = regionmask.core.utils.create_lon_lat_dataarray_from_bounds(*(-180, 181, 2), *(90, -91, -2))

countries_110 = regionmask.defined_regions.natural_earth.countries_110

print("Method: rasterize")
%timeit -n 1 countries_110.mask(ds_GLOB, method="rasterize")
print("Method: shapely")
%timeit -n 1  countries_110.mask(ds_GLOB, method="shapely")
print("Method: legacy")
%timeit -n 1  countries_110.mask(ds_GLOB, method="legacy")
Method: rasterize
18.6 ms ± 2.59 ms per loop (mean ± std. dev. of 7 runs, 1 loop each)
Method: shapely
350 ms ± 28.6 ms per loop (mean ± std. dev. of 7 runs, 1 loop each)
Method: legacy
3.45 s ± 347 ms per loop (mean ± std. dev. of 7 runs, 1 loop each)