Note
This tutorial was generated from an IPython notebook that can be downloaded here.
Create xarray region mask¶
In this tutorial we will show how to create a mask for arbitrary latitude and longitude grids using xarray. It is very similar to the tutorial Create Mask (numpy).
Import regionmask and check the version:
import regionmask
regionmask.__version__
'0.5.0'
Load xarray and the tutorial data:
import xarray as xr
import numpy as np
airtemps = xr.tutorial.load_dataset('air_temperature')
The example data is a temperature field over North America. Let’s plot the first time step:
# load plotting libraries
import matplotlib.pyplot as plt
import cartopy.crs as ccrs
# choose a good projection for regional maps
proj=ccrs.LambertConformal(central_longitude=-100)
ax = plt.subplot(111, projection=proj)
airtemps.isel(time=1).air.plot.pcolormesh(ax=ax, transform=ccrs.PlateCarree())
ax.coastlines();
Conviniently we can directly pass an xarray object to the mask
function. It gets the longitude and latitude from the DataArray/ Dataset
and creates the mask
. If the longituda and latitude in the xarray
object are not called lon
and lat
, respectively; you can pass
their name via the lon_name
and lat_name
keyword. Here we use
the Giorgi regions.
The extent of the grid and the regions is not the same:
lon = airtemps.lon
print("Grid extent: ", lon.values.min(), lon.values.max())
bounds = regionmask.defined_regions.giorgi.bounds_global
print("Region extent: ", bounds[0], bounds[2])
Grid extent: 200.0 330.0
Region extent: -170.0 180.0
Note
From version 0.5 regionmask
automatically detects wether the longitude needs to be wrapped around, i.e. if the regions extend from -180° E to 180° W, while the grid goes from 0° to 360° W.
mask = regionmask.defined_regions.giorgi.mask(airtemps)
This did not work in earlier versions. We had to set wrap_lon=True
.
Let’s plot the mask of the regions:
proj=ccrs.LambertConformal(central_longitude=-100)
ax = plt.subplot(111, projection=proj)
low = mask.min()
high = mask.max()
levels = np.arange(low - 0.5, high + 1)
h = mask.plot.pcolormesh(ax=ax, transform=ccrs.PlateCarree(), levels=levels, add_colorbar=False)
# for colorbar: find abbreviations of all regions that were selected
reg = np.unique(mask.values)
reg = reg[~np.isnan(reg)]
abbrevs = regionmask.defined_regions.giorgi[reg].abbrevs
cbar = plt.colorbar(h, orientation="horizontal", fraction=0.075, pad=0.05)
cbar.set_ticks(reg)
cbar.set_ticklabels(abbrevs)
cbar.set_label("Region")
ax.coastlines()
# fine tune the extent
ax.set_extent([200, 330, 10, 75], crs=ccrs.PlateCarree());
We want to select the region ‘Central North America’. Thus we first need to find out which number this is:
regionmask.defined_regions.giorgi.map_keys('Central North America')
6
Select using where
¶
xarray
provides the handy where
function:
airtemps_CNA = airtemps.where(mask == 6)
Check everything went well by repeating the first plot with the selected region:
# choose a good projection for regional maps
proj=ccrs.LambertConformal(central_longitude=-100)
ax = plt.subplot(111, projection=proj)
airtemps_CNA.isel(time=1).air.plot.pcolormesh(ax=ax, transform=ccrs.PlateCarree())
ax.coastlines();
Looks good - let’s take the area average and plot the time series.
(Note: you should use cos(lat)
weights to correctly calculate an
area average. Unfortunately this is not yet (as of version 0.13)
implemented in xarray.)
ts_airtemps_CNA = airtemps_CNA.mean(dim=('lat', 'lon')) - 273.15
ts_airtemps = airtemps.mean(dim=('lat', 'lon')) - 273.15
# and the line plot
f, ax = plt.subplots()
ts_airtemps_CNA.air.plot.line(ax=ax, label='Central North America')
ts_airtemps.air.plot(ax=ax, label='Entire Domain')
plt.legend(ncol=2);
Select using groupby
¶
# you can group over all integer values of the mask
# you have to take the mean over `stacked_lat_lon`
airtemps_all = airtemps.groupby(mask).mean('stacked_lat_lon')
we can add the abbreviations and names of the regions to the DataArray
# extract the abbreviations and the names of the regions from regionmask
abbrevs = regionmask.defined_regions.giorgi[airtemps_all.region.values].abbrevs
names = regionmask.defined_regions.giorgi[airtemps_all.region.values].names
airtemps_all.coords['abbrevs'] = ('region', abbrevs)
airtemps_all.coords['names'] = ('region', names)
airtemps_all
<xarray.Dataset> Dimensions: (region: 6, time: 2920) Coordinates: * time (time) datetime64[ns] 2013-01-01 ... 2014-12-31T18:00:00 * region (region) float64 4.0 5.0 6.0 7.0 8.0 9.0 abbrevs (region) <U3 'CAM' 'WNA' 'CNA' 'ENA' 'ALA' 'GRL' names (region) <U21 'Central America' ... 'Greenland' Data variables: air (region, time) float32 294.71286 293.61765 ... 253.69574 253.83855
now we can select the regions in many ways
# as before, by the index of the region
r1 = airtemps_all.sel(region=6).air
# with the abbreviation
r2 = airtemps_all.isel(region=(airtemps_all.abbrevs == 'WNA')).air.squeeze()
# with the long name
r3 = airtemps_all.isel(region=(airtemps_all.names == 'Eastern North America')).air.squeeze()
regs = [r1, r2, r3]
Now, let’s plot the three selected regions:
f, axes = plt.subplots(3, 1, sharex=True)
for i, reg in enumerate(regs):
ax = axes[i]
reg.plot(ax=ax)
ax.set_title(reg.names.values)
plt.setp(axes, xlabel="")
f.tight_layout()