{ "cells": [ { "cell_type": "code", "execution_count": null, "metadata": { "tags": [ "hide-cell" ] }, "outputs": [], "source": [ "%matplotlib inline\n", "\n", "from matplotlib import rcParams\n", "\n", "rcParams[\"figure.dpi\"] = 300\n", "rcParams[\"font.size\"] = 8\n", "\n", "import warnings\n", "\n", "warnings.filterwarnings(\"ignore\")" ] }, { "cell_type": "markdown", "metadata": {}, "source": [ "{{ prolog }}" ] }, { "cell_type": "markdown", "metadata": {}, "source": [ "# Create 2D integer masks\n", "\n", "In this tutorial we will show how to create 2D integer mask for arbitrary latitude and longitude grids." ] }, { "cell_type": "markdown", "metadata": {}, "source": [ ":::{tip}\n", "\n", "2D masks are good for plotting. However, to calculate weighted regional averages 3D boolean masks are more convenient. See the [tutorial on 3D masks](mask_3D.ipynb). See [#226](https://github.com/regionmask/regionmask/issues/226) how *weighted* regional averages can be calculated with 2D integer mask (this may also offer some speed gains).\n", "\n", ":::" ] }, { "cell_type": "markdown", "metadata": {}, "source": [ "Import regionmask and check the version:" ] }, { "cell_type": "code", "execution_count": null, "metadata": {}, "outputs": [], "source": [ "import regionmask\n", "\n", "regionmask.__version__" ] }, { "cell_type": "markdown", "metadata": {}, "source": [ "Load xarray and the tutorial data:" ] }, { "cell_type": "code", "execution_count": null, "metadata": {}, "outputs": [], "source": [ "import numpy as np\n", "import xarray as xr\n", "\n", "xr.set_options(display_style=\"text\", display_width=60)" ] }, { "cell_type": "markdown", "metadata": {}, "source": [ "## Creating a mask\n", "\n", "Define a lon/ lat grid with a 1° grid spacing, where the points define the center of the grid." ] }, { "cell_type": "code", "execution_count": null, "metadata": {}, "outputs": [], "source": [ "lon = np.arange(-179.5, 180)\n", "lat = np.arange(-89.5, 90)" ] }, { "cell_type": "markdown", "metadata": {}, "source": [ "We will create a mask with the SREX regions (Seneviratne et al., [2012](https://www.ipcc.ch/site/assets/uploads/2018/03/SREX-Ch3-Supplement_FINAL-1.pdf))." ] }, { "cell_type": "code", "execution_count": null, "metadata": {}, "outputs": [], "source": [ "regionmask.defined_regions.srex" ] }, { "cell_type": "markdown", "metadata": {}, "source": [ "The function `mask` determines which gridpoints lie within the polygon making up the each region:" ] }, { "cell_type": "code", "execution_count": null, "metadata": {}, "outputs": [], "source": [ "mask = regionmask.defined_regions.srex.mask(lon, lat)\n", "mask" ] }, { "cell_type": "markdown", "metadata": {}, "source": [ "`mask` is now a `xarray.Dataset` with shape `lat x lon` (if you need a numpy array use `mask.values`). Gridpoints that do not fall in a region are `NaN`, the gridpoints that fall in a region are encoded with the number of the region (here 1 to 26).\n", "\n", "We can now plot the `mask`:" ] }, { "cell_type": "code", "execution_count": null, "metadata": {}, "outputs": [], "source": [ "import cartopy.crs as ccrs\n", "import matplotlib.pyplot as plt\n", "\n", "f, ax = plt.subplots(subplot_kw=dict(projection=ccrs.PlateCarree()))\n", "ax.coastlines()\n", "\n", "regionmask.defined_regions.srex.plot(\n", " ax=ax, add_label=False, line_kws=dict(lw=0.5, color=\"0.5\")\n", ")\n", "\n", "mask.plot(ax=ax, transform=ccrs.PlateCarree(), add_colorbar=False)" ] }, { "cell_type": "markdown", "metadata": {}, "source": [ "## Working with a mask\n", "\n", "masks can be used to select data in a certain region and to calculate regional averages - let's illustrate this with a 'real' dataset:" ] }, { "cell_type": "code", "execution_count": null, "metadata": {}, "outputs": [], "source": [ "airtemps = xr.tutorial.load_dataset(\"air_temperature\")" ] }, { "cell_type": "markdown", "metadata": {}, "source": [ "The example data is a temperature field over North America. Let's plot the first time step:" ] }, { "cell_type": "code", "execution_count": null, "metadata": {}, "outputs": [], "source": [ "# choose a good projection for regional maps\n", "proj = ccrs.LambertConformal(central_longitude=-100)\n", "\n", "ax = plt.subplot(111, projection=proj)\n", "\n", "airtemps.isel(time=1).air.plot.pcolormesh(ax=ax, transform=ccrs.PlateCarree())\n", "\n", "ax.coastlines();" ] }, { "cell_type": "markdown", "metadata": {}, "source": [ "Conveniently we can directly pass an xarray object to the `mask` function. It gets the longitude and latitude from the `DataArray`/ `Dataset` and creates the `mask`. Per default regionmask assumes the longitude and latitude are called `\"lon\"` and `\"lat\"`. If they have another name, you can pass them individually, e.g. `region.mask(ds.longitude, ds.latitude)`." ] }, { "cell_type": "code", "execution_count": null, "metadata": {}, "outputs": [], "source": [ "mask = regionmask.defined_regions.srex.mask(airtemps)" ] }, { "cell_type": "markdown", "metadata": {}, "source": [ ":::{note}\n", "\n", "regionmask automatically detects whether the longitude needs to be wrapped around, i.e. if the regions extend from -180° E to 180° W, while the grid goes from 0° to 360° W as in our example:\n", "\n", ":::" ] }, { "cell_type": "code", "execution_count": null, "metadata": {}, "outputs": [], "source": [ "lon = airtemps.lon.values\n", "print(f\"Grid extent: {lon.min():3.0f}°E to {lon.max():3.0f}°E\")\n", "\n", "bounds = regionmask.defined_regions.srex.bounds_global\n", "print(f\"Region extent: {bounds[0]:3.0f}°E to {bounds[2]:3.0f}°E\")" ] }, { "cell_type": "markdown", "metadata": {}, "source": [ "Let's plot the mask of the regions:" ] }, { "cell_type": "code", "execution_count": null, "metadata": {}, "outputs": [], "source": [ "proj = ccrs.LambertConformal(central_longitude=-100)\n", "ax = plt.subplot(111, projection=proj)\n", "\n", "low = mask.min()\n", "high = mask.max()\n", "\n", "levels = np.arange(low - 0.5, high + 1)\n", "\n", "h = mask.plot.pcolormesh(\n", " ax=ax, transform=ccrs.PlateCarree(), levels=levels, add_colorbar=False\n", ")\n", "\n", "# for colorbar: find abbreviations of all regions that were selected\n", "reg = np.unique(mask.values)\n", "reg = reg[~np.isnan(reg)]\n", "abbrevs = regionmask.defined_regions.srex[reg].abbrevs\n", "\n", "cbar = plt.colorbar(h, orientation=\"horizontal\", fraction=0.075, pad=0.05)\n", "\n", "cbar.set_ticks(reg)\n", "cbar.set_ticklabels(abbrevs)\n", "cbar.set_label(\"Region\")\n", "\n", "ax.coastlines()\n", "\n", "# fine tune the extent\n", "ax.set_extent([200, 330, 10, 75], crs=ccrs.PlateCarree())" ] }, { "cell_type": "markdown", "metadata": {}, "source": [ "### Select a specific region - using cf_xarray" ] }, { "cell_type": "markdown", "metadata": {}, "source": [ "If [cf_xarray](https://cf-xarray.readthedocs.io/en/latest/index.html) is installed we can select any individual region using the [flags](https://cf-xarray.readthedocs.io/en/latest/flags.html) saved in the attributes (`attrs`) of the mask (new in regionmask 0.10.0):" ] }, { "cell_type": "code", "execution_count": null, "metadata": {}, "outputs": [], "source": [ "mask.cf" ] }, { "cell_type": "markdown", "metadata": {}, "source": [ "Thus to select the region named \"Central North America\" (abbreviated \"CNA\") we can do:" ] }, { "cell_type": "code", "execution_count": null, "metadata": {}, "outputs": [], "source": [ "mask_CNA = mask.cf == \"CNA\"\n", "\n", "# show a subset\n", "mask_CNA.sel(lat=30, lon=slice(240, 260))" ] }, { "cell_type": "markdown", "metadata": {}, "source": [ ":::{warning}\n", "\n", "`flag_meanings` cannot contain spaces - they will be replaced by underscores:\n", "\n", ":::" ] }, { "cell_type": "code", "execution_count": null, "metadata": {}, "outputs": [], "source": [ "mask_names = regionmask.defined_regions.srex.mask(airtemps, flag=\"names\")\n", "\n", "mask_names.cf == \"C. North America\".replace(\" \", \"_\")" ] }, { "cell_type": "markdown", "metadata": {}, "source": [ "### Select a specific region - without cf_xarray" ] }, { "cell_type": "markdown", "metadata": {}, "source": [ "We first need to find out which number the region 'C. North America' corresponds to:" ] }, { "cell_type": "code", "execution_count": null, "metadata": {}, "outputs": [], "source": [ "CNA_index = regionmask.defined_regions.srex.map_keys(\"C. North America\")\n", "CNA_index" ] }, { "cell_type": "code", "execution_count": null, "metadata": {}, "outputs": [], "source": [ "mask_CNA = mask == CNA_index" ] }, { "cell_type": "markdown", "metadata": {}, "source": [ "### Mask out a region" ] }, { "cell_type": "markdown", "metadata": {}, "source": [ "To replace all grid points outside of CNA we use the `where` method of `xr.Dataset` ([documentation](https://docs.xarray.dev/en/stable/generated/xarray.Dataset.where.html)), which filters elements from this object according to a condition:" ] }, { "cell_type": "code", "execution_count": null, "metadata": {}, "outputs": [], "source": [ "airtemps_CNA = airtemps.where(mask_CNA)" ] }, { "cell_type": "markdown", "metadata": {}, "source": [ "Check everything went well by repeating the first plot with the selected region:" ] }, { "cell_type": "code", "execution_count": null, "metadata": {}, "outputs": [], "source": [ "# choose a good projection for regional maps\n", "proj = ccrs.LambertConformal(central_longitude=-100)\n", "\n", "ax = plt.subplot(111, projection=proj)\n", "\n", "regionmask.defined_regions.srex[[\"CNA\"]].plot(ax=ax, add_label=False)\n", "\n", "airtemps_CNA.isel(time=1).air.plot.pcolormesh(ax=ax, transform=ccrs.PlateCarree())\n", "\n", "ax.coastlines();" ] }, { "cell_type": "markdown", "metadata": {}, "source": [ "Looks good - with this we can calculate the region average.\n", "\n", "### Calculate weighted regional average\n", "\n", "We use the xarray method to calculate the weighted mean with `cos(lat)` as proxy of the grid cell area." ] }, { "cell_type": "markdown", "metadata": {}, "source": [ ":::{note}\n", "\n", "It is better to use a model's original grid cell area (e.g. ``areacella``). ``cos(lat)`` works reasonably well for regular lat/ lon grids. For irregular grids (regional models, ocean models, ...) it is not appropriate.\n", "\n", ":::" ] }, { "cell_type": "code", "execution_count": null, "metadata": {}, "outputs": [], "source": [ "weights = np.cos(np.deg2rad(airtemps.lat))\n", "\n", "ts_airtemps_CNA = airtemps_CNA.weighted(weights).mean(dim=(\"lat\", \"lon\")) - 273.15" ] }, { "cell_type": "markdown", "metadata": {}, "source": [ "We plot the resulting time series:" ] }, { "cell_type": "code", "execution_count": null, "metadata": {}, "outputs": [], "source": [ "f, ax = plt.subplots()\n", "ts_airtemps_CNA.air.plot.line(ax=ax, label=\"Central North America\")\n", "\n", "ax.axhline(0, color=\"0.1\", lw=0.5)\n", "\n", "plt.legend();" ] }, { "cell_type": "markdown", "metadata": { "collapsed": true }, "source": [ "To get the regional average for each region you would need to loop over them. However, it's easier to use a 3D mask.\n", "\n", "### Calculate regional statistics using `groupby`" ] }, { "cell_type": "markdown", "metadata": {}, "source": [ ":::{warning}\n", "\n", "Using ``groupby`` offers some convenience and is faster than using ``where`` and a loop. However, xarray does currently not natively support to combine ``groupby`` with ``weighted`` ([pydata/xarray#3937](https://github.com/pydata/xarray/issues/3937)), see [#226](https://github.com/regionmask/regionmask/issues/226) for a workaround.\n", "Overall, I recommend working with a {doc}`3D masks `.\n", "\n", ":::" ] }, { "cell_type": "code", "execution_count": null, "metadata": {}, "outputs": [], "source": [ "# you can group over all integer values of the mask\n", "airtemps_all = airtemps.groupby(mask).mean()\n", "airtemps_all" ] }, { "cell_type": "markdown", "metadata": {}, "source": [ "However, `groupby` is the way to go when calculating a (unweighted) regional median:" ] }, { "cell_type": "code", "execution_count": null, "metadata": {}, "outputs": [], "source": [ "# you can group over all integer values of the mask\n", "airtemps_reg_median = airtemps.groupby(mask).median()\n", "airtemps_reg_median.isel(time=0)" ] }, { "cell_type": "markdown", "metadata": {}, "source": [ "## Multidimensional coordinates\n", "\n", "Regionmask can also handle mutltidimensional longitude/ latitude grids (e.g. from a regional climate model). As xarray provides such an example dataset, we will use it to illustrate it. See also in the [xarray documentation](http://xarray.pydata.org/en/stable/examples/multidimensional-coords.html/).\n", "\n", "Load the tutorial data:" ] }, { "cell_type": "code", "execution_count": null, "metadata": {}, "outputs": [], "source": [ "rasm = xr.tutorial.load_dataset(\"rasm\")" ] }, { "cell_type": "markdown", "metadata": {}, "source": [ "The example data is a temperature field over the Northern Hemisphere. Let's plot the first time step:" ] }, { "cell_type": "code", "execution_count": null, "metadata": {}, "outputs": [], "source": [ "# choose a projection\n", "proj = ccrs.NorthPolarStereo()\n", "\n", "ax = plt.subplot(111, projection=proj)\n", "ax.set_global()\n", "\n", "rasm.isel(time=1).Tair.plot.pcolormesh(\n", " ax=ax, x=\"xc\", y=\"yc\", transform=ccrs.PlateCarree()\n", ")\n", "\n", "# add the abbreviation of the regions\n", "regionmask.defined_regions.srex[[1, 2, 11, 12, 18]].plot(\n", " ax=ax, add_coastlines=False, label=\"abbrev\"\n", ")\n", "\n", "ax.set_extent([-180, 180, 43, 90], ccrs.PlateCarree())\n", "\n", "ax.coastlines();" ] }, { "cell_type": "markdown", "metadata": {}, "source": [ "Again we pass the xarray object to regionmask. Here, cf_xarray is used to detect `rasm.xc` and `rasm.yc` as longitude and latitude coordinates (new in regionmask 0.10.0). Without cf_xarray we would need to pass the coordinates explicitly (i.e. `srex.mask(rasm.xc, rasm.yc`):" ] }, { "cell_type": "code", "execution_count": null, "metadata": {}, "outputs": [], "source": [ "mask = regionmask.defined_regions.srex.mask(rasm)\n", "mask" ] }, { "cell_type": "markdown", "metadata": {}, "source": [ "We want to select the region 'NAS' (Northern Asia).\n", "\n", "### Select using `where`\n", "\n", "Using the cf_xarray accessor (from regionmask 0.10.0) we can directly use the name of the region (otherwise we need to select by the the number of the region):" ] }, { "cell_type": "code", "execution_count": null, "metadata": {}, "outputs": [], "source": [ "rasm_NAS = rasm.where(mask.cf == \"NAS\")" ] }, { "cell_type": "markdown", "metadata": {}, "source": [ "Check everything went well by repeating the first plot with the selected region:" ] }, { "cell_type": "code", "execution_count": null, "metadata": {}, "outputs": [], "source": [ "# choose a projection\n", "proj = ccrs.NorthPolarStereo()\n", "\n", "ax = plt.subplot(111, projection=proj)\n", "ax.set_global()\n", "\n", "rasm_NAS.isel(time=1).Tair.plot.pcolormesh(\n", " ax=ax, x=\"xc\", y=\"yc\", transform=ccrs.PlateCarree()\n", ")\n", "\n", "\n", "# add the abbreviation of the regions\n", "regionmask.defined_regions.srex[[\"NAS\"]].plot(\n", " ax=ax, add_coastlines=False, label=\"abbrev\"\n", ")\n", "\n", "ax.set_extent([-180, 180, 45, 90], ccrs.PlateCarree())\n", "\n", "ax.coastlines();" ] }, { "cell_type": "markdown", "metadata": {}, "source": [ "## References\n", "\n", "* Special Report on Managing the Risks of Extreme Events and Disasters to Advance Climate Change Adaptation (SREX, Seneviratne et al., [2012](https://www.ipcc.ch/site/assets/uploads/2018/03/SREX-Ch3-Supplement_FINAL-1.pdf))" ] } ], "metadata": { "kernelspec": { "display_name": "regionmask-docs", "language": "python", "name": "python3" }, "language_info": { "codemirror_mode": { "name": "ipython", "version": 3 }, "file_extension": ".py", "mimetype": "text/x-python", "name": "python", "nbconvert_exporter": "python", "pygments_lexer": "ipython3", "version": "3.12.4" } }, "nbformat": 4, "nbformat_minor": 1 }