{ "cells": [ { "cell_type": "code", "execution_count": null, "metadata": { "tags": [ "hide-cell" ] }, "outputs": [], "source": [ "%matplotlib inline\n", "\n", "from matplotlib import rcParams\n", "\n", "rcParams[\"figure.dpi\"] = 300\n", "rcParams[\"font.size\"] = 8\n", "\n", "import warnings\n", "\n", "warnings.filterwarnings(\"ignore\")" ] }, { "cell_type": "markdown", "metadata": {}, "source": [ "{{ prolog }}" ] }, { "cell_type": "markdown", "metadata": {}, "source": [ "# Create 3D boolean masks\n", "\n", "In this tutorial we will show how to create 3D boolean masks for arbitrary latitude and longitude grids. It uses the same algorithm to determine if a gridpoint is in a region as for the 2D mask. However, it returns a `xarray.Dataset` with shape `region x lat x lon`, gridpoints that do not fall in a region are `False`, the gridpoints that fall in a region are `True`.\n", "\n", "3D masks are convenient as they can be used to directly calculate weighted regional means (over all regions) using xarray v0.15.1 or later. Further, the mask includes the region names and abbreviations as non-dimension coordinates." ] }, { "cell_type": "markdown", "metadata": {}, "source": [ ":::{tip}\n", "\n", "3D boolean masks are more convenient to work with but [2D masks](mask_2D.ipynb) may offer some speed gains when calculating regional means. See [226](https://github.com/regionmask/regionmask/issues/226) how *weighted* regional averages can be calculated with 2D integer mask.\n", "\n", ":::" ] }, { "cell_type": "markdown", "metadata": {}, "source": [ "Import regionmask and check the version:" ] }, { "cell_type": "code", "execution_count": null, "metadata": {}, "outputs": [], "source": [ "import regionmask\n", "\n", "regionmask.__version__" ] }, { "cell_type": "markdown", "metadata": {}, "source": [ "Load xarray and numpy:" ] }, { "cell_type": "code", "execution_count": null, "metadata": {}, "outputs": [], "source": [ "import numpy as np\n", "import xarray as xr\n", "\n", "# don't expand data\n", "xr.set_options(display_style=\"text\", display_expand_data=False, display_width=60)" ] }, { "cell_type": "markdown", "metadata": {}, "source": [ "## Creating a mask\n", "\n", "Define a lon/ lat grid with a 1° grid spacing, where the points define the center of the grid:" ] }, { "cell_type": "code", "execution_count": null, "metadata": {}, "outputs": [], "source": [ "lon = np.arange(-179.5, 180)\n", "lat = np.arange(-89.5, 90)" ] }, { "cell_type": "markdown", "metadata": {}, "source": [ "We will create a mask with the SREX regions (Seneviratne et al., 2012)." ] }, { "cell_type": "code", "execution_count": null, "metadata": {}, "outputs": [], "source": [ "regionmask.defined_regions.srex" ] }, { "cell_type": "markdown", "metadata": {}, "source": [ "The function `mask_3D` determines which gripoints lie within the polygon making up each region:" ] }, { "cell_type": "code", "execution_count": null, "metadata": {}, "outputs": [], "source": [ "mask = regionmask.defined_regions.srex.mask_3D(lon, lat)\n", "mask" ] }, { "cell_type": "markdown", "metadata": {}, "source": [ "As mentioned, `mask` is a boolean `xarray.Dataset` with shape `region x lat x lon`. It contains `region` (=`numbers`) as dimension coordinate as well as `abbrevs` and `names` as non-dimension coordinates (see the xarray docs for the details on the [terminology](http://xarray.pydata.org/en/stable/terminology.html)).\n", "\n", "## Plotting\n", "\n", "### Plotting individual layers\n", "\n", "The four first layers look as follows:" ] }, { "cell_type": "code", "execution_count": null, "metadata": {}, "outputs": [], "source": [ "import cartopy.crs as ccrs\n", "import matplotlib.pyplot as plt\n", "from matplotlib import colors as mplc\n", "\n", "cmap1 = mplc.ListedColormap([\"none\", \"#9ecae1\"])\n", "\n", "fg = mask.isel(region=slice(4)).plot(\n", " subplot_kws=dict(projection=ccrs.PlateCarree()),\n", " col=\"region\",\n", " col_wrap=2,\n", " transform=ccrs.PlateCarree(),\n", " add_colorbar=False,\n", " aspect=1.5,\n", " cmap=cmap1,\n", ")\n", "\n", "for ax in fg.axes.flatten():\n", " ax.coastlines()\n", "\n", "fg.fig.subplots_adjust(hspace=0, wspace=0.1);" ] }, { "cell_type": "markdown", "metadata": {}, "source": [ "### Plotting flattened masks\n", "\n", "A 3D mask cannot be directly plotted - it needs to be flattened first. To do this regionmask offers a convenience function: `regionmask.plot_3D_mask`. The function takes a 3D mask as argument, all other keyword arguments are passed through to `xr.plot.pcolormesh`." ] }, { "cell_type": "code", "execution_count": null, "metadata": {}, "outputs": [], "source": [ "regionmask.plot_3D_mask(mask, add_colorbar=False, cmap=\"plasma\");" ] }, { "cell_type": "markdown", "metadata": {}, "source": [ "## Working with a 3D mask\n", "\n", "masks can be used to select data in a certain region and to calculate regional averages - let's illustrate this with a 'real' dataset:" ] }, { "cell_type": "code", "execution_count": null, "metadata": {}, "outputs": [], "source": [ "airtemps = xr.tutorial.load_dataset(\"air_temperature\")" ] }, { "cell_type": "markdown", "metadata": {}, "source": [ "The example data is a temperature field over North America. Let's plot the first time step:" ] }, { "cell_type": "code", "execution_count": null, "metadata": {}, "outputs": [], "source": [ "# choose a good projection for regional maps\n", "proj = ccrs.LambertConformal(central_longitude=-100)\n", "\n", "ax = plt.subplot(111, projection=proj)\n", "\n", "airtemps.isel(time=1).air.plot.pcolormesh(ax=ax, transform=ccrs.PlateCarree())\n", "\n", "ax.coastlines()\n", "None" ] }, { "cell_type": "markdown", "metadata": {}, "source": [ "An xarray object can be passed to the `mask_3D` function:" ] }, { "cell_type": "code", "execution_count": null, "metadata": {}, "outputs": [], "source": [ "mask_3D = regionmask.defined_regions.srex.mask_3D(airtemps)\n", "mask_3D" ] }, { "cell_type": "markdown", "metadata": {}, "source": [ "Per default this creates a `mask` containing one layer (slice) for each region containing (at least) one gridpoint. As the example data only has values over Northern America we only get only 6 layers even though there are 26 SREX regions. To obtain all layers specify `drop=False`:" ] }, { "cell_type": "code", "execution_count": null, "metadata": {}, "outputs": [], "source": [ "mask_full = regionmask.defined_regions.srex.mask_3D(airtemps, drop=False)\n", "mask_full" ] }, { "cell_type": "markdown", "metadata": {}, "source": [ "Note `mask_full` now has 26 layers." ] }, { "cell_type": "markdown", "metadata": {}, "source": [ "### Select a region\n", "\n", "As `mask_3D` contains `region`, `abbrevs`, and `names` as (non-dimension) coordinates we can use each of those to select an individual region:" ] }, { "cell_type": "code", "execution_count": null, "metadata": {}, "outputs": [], "source": [ "# 1) by the index of the region:\n", "r1 = mask_3D.sel(region=3)\n", "\n", "# 2) with the abbreviation\n", "r2 = mask_3D.isel(region=(mask_3D.abbrevs == \"WNA\"))\n", "\n", "# 3) with the long name:\n", "r3 = mask_3D.isel(region=(mask_3D.names == \"E. North America\"))" ] }, { "cell_type": "markdown", "metadata": {}, "source": [ "This also applies to the regionally-averaged data below. \n", "\n", "It is currently not possible to use `sel` with a non-dimension coordinate - to directly select `abbrev` or `name` you need to create a `MultiIndex`:" ] }, { "cell_type": "code", "execution_count": null, "metadata": {}, "outputs": [], "source": [ "mask_3D.set_index(regions=[\"region\", \"abbrevs\", \"names\"]);" ] }, { "cell_type": "markdown", "metadata": {}, "source": [ "### Mask out a region\n", "\n", "Using `where` a specific region can be 'masked out' (i.e. all data points outside of the region become `NaN`):" ] }, { "cell_type": "code", "execution_count": null, "metadata": {}, "outputs": [], "source": [ "airtemps_cna = airtemps.where(r1)" ] }, { "cell_type": "markdown", "metadata": {}, "source": [ "Which looks as follows:" ] }, { "cell_type": "code", "execution_count": null, "metadata": {}, "outputs": [], "source": [ "proj = ccrs.LambertConformal(central_longitude=-100)\n", "\n", "ax = plt.subplot(111, projection=proj)\n", "\n", "airtemps_cna.isel(time=1).air.plot(ax=ax, transform=ccrs.PlateCarree())\n", "\n", "ax.coastlines()\n", "None" ] }, { "cell_type": "markdown", "metadata": {}, "source": [ "We could now use `airtemps_cna` to calculate the regional average for 'Central North America'. However, there is a more elegant way.\n", "\n", "### Calculate weighted regional averages\n", "\n", "Using the 3-dimensional mask it is possible to calculate weighted averages of all regions in one go, using the `weighted` method (requires xarray 0.15.1 or later). As proxy of the grid cell area we use `cos(lat)`." ] }, { "cell_type": "markdown", "metadata": {}, "source": [ ":::{note}\n", "\n", "It is better to use a model's original grid cell area (e.g. ``areacella``). ``cos(lat)`` works reasonably well for regular lat/ lon grids. For irregular grids (regional models, ocean models, ...) it is not appropriate.\n", "\n", ":::" ] }, { "cell_type": "code", "execution_count": null, "metadata": {}, "outputs": [], "source": [ "weights = np.cos(np.deg2rad(airtemps.lat))\n", "\n", "ts_airtemps_regional = airtemps.weighted(mask_3D * weights).mean(dim=(\"lat\", \"lon\"))" ] }, { "cell_type": "markdown", "metadata": {}, "source": [ "Let's break down what happens here. By multiplying `mask_3D * weights` we get a DataArray where gridpoints not in the region get a weight of 0. Gridpoints within a region get a weight proportional to the gridcell area. `airtemps.weighted(mask_3D * weights)` creates an xarray object which can be used for weighted operations. From this we calculate the weighted `mean` over the lat and lon dimensions. The resulting dataarray has the dimensions `region x time`:" ] }, { "cell_type": "code", "execution_count": null, "metadata": {}, "outputs": [], "source": [ "ts_airtemps_regional" ] }, { "cell_type": "markdown", "metadata": {}, "source": [ "The regionally-averaged time series can be plotted:" ] }, { "cell_type": "code", "execution_count": null, "metadata": {}, "outputs": [], "source": [ "ts_airtemps_regional.air.plot(col=\"region\", col_wrap=3);" ] }, { "cell_type": "markdown", "metadata": {}, "source": [ "### Restrict the mask to land points\n", "\n", "Combining the mask of the regions with a land-sea mask we can create a land-only mask using the `land_110` region from NaturalEarth." ] }, { "cell_type": "markdown", "metadata": {}, "source": [ ":::{warning}\n", "\n", "It is better to use a model's original land/ sea mask (e.g. ``sftlf``). Many CMIP models treat the Antarctic ice shelves and the Caspian Sea as land, while it is classified as 'water' in ``natural_earth_v5_0_0.land_110``.\n", "\n", ":::" ] }, { "cell_type": "markdown", "metadata": {}, "source": [ "With this caveat in mind we can create the land-sea mask:" ] }, { "cell_type": "code", "execution_count": null, "metadata": {}, "outputs": [], "source": [ "land_110 = regionmask.defined_regions.natural_earth_v5_0_0.land_110\n", "\n", "land_mask = land_110.mask_3D(airtemps)" ] }, { "cell_type": "markdown", "metadata": {}, "source": [ "and plot it" ] }, { "cell_type": "code", "execution_count": null, "metadata": {}, "outputs": [], "source": [ "proj = ccrs.LambertConformal(central_longitude=-100)\n", "\n", "ax = plt.subplot(111, projection=proj)\n", "\n", "land_mask.squeeze().plot.pcolormesh(\n", " ax=ax, transform=ccrs.PlateCarree(), cmap=cmap1, add_colorbar=False\n", ")\n", "\n", "ax.coastlines();" ] }, { "cell_type": "markdown", "metadata": {}, "source": [ "To create the combined mask we multiply the two:" ] }, { "cell_type": "code", "execution_count": null, "metadata": {}, "outputs": [], "source": [ "mask_lsm = mask_3D * land_mask.squeeze(drop=True)" ] }, { "cell_type": "markdown", "metadata": {}, "source": [ "Note the `.squeeze(drop=True)`. This is required to remove the `region` dimension from `land_mask`.\n", "\n", "Finally, we compare the original mask with the one restricted to land points:" ] }, { "cell_type": "code", "execution_count": null, "metadata": {}, "outputs": [], "source": [ "f, axes = plt.subplots(1, 2, subplot_kw=dict(projection=proj))\n", "\n", "ax = axes[0]\n", "mask_3D.sel(region=2).plot(\n", " ax=ax, transform=ccrs.PlateCarree(), add_colorbar=False, cmap=cmap1\n", ")\n", "ax.coastlines()\n", "ax.set_title(\"Regional mask: all points\")\n", "\n", "ax = axes[1]\n", "mask_lsm.sel(region=2).plot(\n", " ax=ax, transform=ccrs.PlateCarree(), add_colorbar=False, cmap=cmap1\n", ")\n", "ax.coastlines()\n", "ax.set_title(\"Regional mask: land only\");" ] }, { "cell_type": "markdown", "metadata": {}, "source": [ "## References\n", "\n", "* Special Report on Managing the Risks of Extreme Events and Disasters to Advance Climate Change Adaptation (SREX, Seneviratne et al., [2012](https://www.ipcc.ch/site/assets/uploads/2018/03/SREX-Ch3-Supplement_FINAL-1.pdf))" ] } ], "metadata": { "kernelspec": { "display_name": "Python 3 (ipykernel)", "language": "python", "name": "python3" }, "language_info": { "codemirror_mode": { "name": "ipython", "version": 3 }, "file_extension": ".py", "mimetype": "text/x-python", "name": "python", "nbconvert_exporter": "python", "pygments_lexer": "ipython3", "version": "3.10.13" } }, "nbformat": 4, "nbformat_minor": 4 }