Exploring coastal water levels and storm surges from a global tide and surge model#
This notebook can be run on free online platforms, such as Binder, Kaggle and Colab, or it can be accessed from GitHub. The links to run this notebook in these environments are provided here, but please note that they are not supported by ECMWF.
Introduction#
This notebook provides a practical introduction on how to access, visualize and analyse the GTSMip tide and surge water level timeseries data available in the Climate Data Store (CDS) of the Copernicus Climate Change Service (C3S). It was developed by Deltares to accompany the dataset, which is the output of the global tide and surge model GTSMv3.0.
The GTSMip dataset on CDS includes total water level, surge and tidal elevation timeseries, as well as annual mean sea level values (based on sea level rise projections). In addition to the reanalysis data forced with ERA5, model timeseries forced with several high-resolution climate models (HighResMIP) are available for the period 1950 to 2050. The timeseries are available for a large number of offshore and coastal locations (over 43,000 locations globally). For more information on the full dataset, please refer to the Product User Guide.
In this notebook we use the data from the GTSM reanalysis experiment, where the surges were generated using the ERA5 meteorological reanalysis. This data is available from 1950 onwards. We use the reanalysis experiment data to explore historical storm events. Reanalysis data is useful for analysing the contribution of surge to high coastal water levels, and as input for local-scale hydrodynamic modelling, including scenario modelling.
Data description#
This notebook introduces you to the GTSMip tide and surge water level timeseries. The datasets used in the notebook have the following specifications:
Variable: mean sea level, storm surge residual, tidal elevation and total water level
Experiment: historical, future or reanalysis
Model: climate models (not applicable to reanalysis data)
Temporal aggregation: temporal frequency of the data: hourly, 10-min or daily maxima
Year: selection of year; for the reanalysis experiment, output for 1950 to 2023 is available
Month: selection of months (January to December)
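As an illustration of how these specifications become a CDS API request, the dictionary below is a sketch that uses only facet names and values appearing in the retrieval calls later in this notebook (the dataset identifier is sis-water-level-change-timeseries-cmip6); it is not an exhaustive listing of valid combinations:

```python
# Sketch: the dataset facets listed above map onto keys of a CDS API request.
# Facet names and values are those used in the retrieval calls later in this
# notebook; other combinations exist (see the CDS download form).
request = {
    'variable': ['total_water_level'],   # or storm_surge_residual, tidal_elevation, mean_sea_level
    'experiment': 'reanalysis',          # or historical / future
    'temporal_aggregation': ['hourly'],  # or 10_min / annual
    'year': '2023',
    'month': ['12'],
    'format': 'zip',
}
# This dict would be passed as:
# cdsapi.Client().retrieve('sis-water-level-change-timeseries-cmip6', request, 'target.zip')
```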
Prepare your environment#
Set up the CDS API and your credentials#
The code below will ensure that the cdsapi package is installed. If you have not set up your ~/.cdsapirc file with your credentials, you can replace None with the credentials found on the how to api page (you will need to log in to see your credentials).
!pip install -q cdsapi
# If you have already setup your .cdsapirc file you can leave this as None
cdsapi_key = None
cdsapi_url = None
Install and import libraries#
The following code block will import the python modules required for this notebook. We will be using cdsapi for downloading the data and xarray for handling the data, and matplotlib and cartopy for plotting.
# General libraries for file paths, data extraction, etc
from glob import glob
import os
import zipfile
import urllib3
# CDS API
import cdsapi
# Libraries for working with multi-dimensional arrays
import numpy as np
import xarray as xr
# Libraries for plotting and visualising data
import matplotlib.pyplot as plt
import cartopy.crs as ccrs
import cartopy.feature as cfeature
# Libraries for dealing with time variables
from datetime import datetime
urllib3.disable_warnings() # Disable warnings for data download via API
Set up directories to store the data and output#
DATADIR = './data_dir/'
if not os.path.exists(DATADIR):
os.makedirs(DATADIR)
OUTDIR = os.path.join(DATADIR, 'output')
if not os.path.exists(OUTDIR):
os.makedirs(OUTDIR)
Select and retrieve data from the CDS using the CDS API#
Select storm event and location of interest#
For the purposes of this exercise, we focus on a past event that led to high coastal water levels and increased risk of flooding. We specify the name of the storm, its year and the coordinates of a coastal location where we would like to see the water level data, as well as the timeframe for visualization (which needs to correspond to the period in which the storm impact occurred).
Three options are pre-defined below, and you can select the storm to be visualized in this notebook by changing the storm variable. The pre-defined options are storms Xyntia (2010), Xavier (2013) and Pia (2023). However, you can add custom options for other storms and locations by specifying the year, the coordinates of the location of interest and the timeframe you would like to visualize. The coordinates are used to snap to the nearest model output location. Please note that the coordinates need to be located at the open coast (i.e. not inside a narrow estuary), keeping in mind that GTSM is a global model with a coastal resolution of 1.25 km along European coasts. For a visualization of the available output points from GTSM, see the plots generated further on in this notebook.
We have selected storms Xyntia (2010), Xavier (2013) and Pia (2023) to illustrate the analysis of the global tide and surge model data because these storms are relatively recent and had significant impact on European coasts.
Storm Xyntia (2010) was an exceptionally violent storm that impacted Western European countries from the end of February to the beginning of March 2010. Its most destructive coastal impact was on the western coast of France, in the area surrounding La Rochelle (departments Vendée and Charente-Maritime). Coastal water levels were exceptionally high during this storm, reaching approximately 4.1 m above mean sea level (MSL) at the tide gauge located in the Marina of La Rochelle (source), which led to damage to coastal infrastructure and flooding. In the example in this notebook we focus on a location close to this tide gauge at the coast of La Rochelle.
Storm Xavier (2013) was a winter storm in the North Sea that caused severe coastal flooding on the UK coast, breaching coastal defences and inundating housing and farmland. One of the impacted regions was the Humber estuary on the east coast of the UK. Water levels during the storm were measured by the tide gauge at Immingham at the entrance to the Humber estuary, which recorded maximum water levels of 5.2 m above ODN (Ordnance Datum Newlyn, approximately equal to MSL) and surge levels of 1.97 m (source).
Storm Pia (2023) mainly impacted the United Kingdom, the Netherlands, Scandinavia, Belgium and Germany, causing major disruption and flooding. In the Netherlands, this storm notably triggered the closure of the Maeslantkering storm surge barrier at the Port of Rotterdam, for the first time in the history of this barrier (in operation since 1997). The closure was triggered when water levels exceeded 3 m above NAP (Normaal Amsterdams Peil, approximately equal to MSL). In the example in this notebook we look at a location close to the entrance to the Port of Rotterdam to examine the water levels.
# Select storm (see next cell for options)
storm = 'Pia'
# Storm Xyntia (2010)
if storm == 'Xyntia':
year = 2010
latlon = [46.16, -1.24] # West coast of France near L'Aiguillon-la-Presqu'île
starttime = '19 February'
endtime = '7 March'
# Storm Xavier (2013)
elif storm == 'Xavier':
year = 2013
latlon = [53.63, -0.17] # East coast of the UK near the Humber estuary
starttime = '2 December'
endtime = '11 December'
# Storm Pia (2023)
elif storm == 'Pia':
year = 2023
latlon = [51.98, 4.075] # South Netherlands coast close to Port of Rotterdam
starttime = '13 December'
endtime = '29 December'
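The if/elif chain above simply leaves the variables undefined if the storm name is misspelled, which only surfaces later as a NameError. A dictionary-based sketch of the same three options fails fast instead (the ValueError guard is an addition for illustration, not part of the original notebook):

```python
# Sketch: the same three storm options stored in one dictionary, so that an
# unknown storm name raises a clear error immediately.
storm_options = {
    'Xyntia': dict(year=2010, latlon=[46.16, -1.24],
                   starttime='19 February', endtime='7 March'),
    'Xavier': dict(year=2013, latlon=[53.63, -0.17],
                   starttime='2 December', endtime='11 December'),
    'Pia':    dict(year=2023, latlon=[51.98, 4.075],
                   starttime='13 December', endtime='29 December'),
}

storm = 'Pia'  # selected storm, as in the cell above
if storm not in storm_options:
    raise ValueError(f"Unknown storm '{storm}'; choose from {list(storm_options)}")
year, latlon, starttime, endtime = (storm_options[storm][k]
                                    for k in ('year', 'latlon', 'starttime', 'endtime'))
```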
Request data from CDS#
We can request data from the Climate Data Store (CDS) with the help of the CDS API. The CDS API credentials can be set manually below by changing the KEY argument. The string of characters that makes up your KEY is your personal CDS API key. To obtain it, first register or log in to the CDS (https://cds.climate.copernicus.eu), then visit https://cds.climate.copernicus.eu/how-to-api and copy the string of characters listed after “key:”. Replace the None below with this string. Please note: if you already have the CDS API configured on your machine, you can leave KEY = None.
URL = 'https://cds.climate.copernicus.eu/api'
KEY = None
The next step is then to request the data with the help of the CDS API. We download the following data to be used in this notebook:
for water levels and surges hourly data for the full year that corresponds to the selected storm - we will retrieve this data to compute yearly statistics and visualise water levels during the storm;
tidal elevation at 10 min temporal resolution for the specified period relevant to the selected storm - this data is used to visually inspect the contribution of tidal elevation to total water levels during the storm;
mean sea level for multiple years, to better understand the vertical reference of the data and the contribution of sea level rise.
At the moment it is not possible to download the data within a geographical bounding box; therefore we download the full dataset for the given year. In total, this notebook will download about 300 MB of data.
Before you run the cells below, the terms and conditions on the use of the data need to have been accepted in the CDS. You can view and accept these conditions by logging into the CDS, searching for the dataset, then scrolling to the end of the Download data section.
Note
For more information about data access through the Climate Data Store, please see the CDS user guide here.
# Make connection with CDS via API
c = cdsapi.Client(url=URL, key=KEY)
2025-08-22 10:32:39,085 INFO [2024-09-26T00:00:00] Watch our [Forum](https://forum.ecmwf.int/) for Announcements, news and other discussed topics.
Note
If the data request does not run successfully, please check the CDS system status page. If the problem persists, consider reaching out to CDS Support.
# Download data - water level and surge
filename = os.path.join(DATADIR, f'water_levels_{year}.zip')
c.retrieve(
'sis-water-level-change-timeseries-cmip6',
{
'variable': ['total_water_level', 'storm_surge_residual'],
'experiment': 'reanalysis',
'temporal_aggregation': ['hourly'],
'year': year,
'month': [f'{mm:02}' for mm in range(datetime.strptime(starttime + f' {year}', '%d %B %Y').month,
datetime.strptime(endtime + f' {year}', '%d %B %Y').month+1)],
'version': ["v3"],
'format': 'zip',
},
filename)
2025-08-22 10:41:51,350 INFO [2025-04-17T00:00:00] A new version has been added requiring a change to the download form. Users are advised that CDS API requests will need to be updated to accommodate this change. Please see the known issues under the Documentation tab for more information regarding the version update.
2025-08-22 10:41:51,350 INFO Request ID is 53d90b3d-6474-4d7c-a24c-f334dc13f089
2025-08-22 10:41:51,409 INFO status has been updated to accepted
2025-08-22 10:42:05,162 INFO status has been updated to running
2025-08-22 10:42:12,788 INFO status has been updated to successful
'./data_dir/water_levels_2023.zip'
# Download data - tide
filename = os.path.join(DATADIR, f'tide_{year}.zip')
c.retrieve(
'sis-water-level-change-timeseries-cmip6',
{
'variable': ['tidal_elevation'],
'experiment': ['historical' if year <= 2014 else 'future'],
'temporal_aggregation': ['10_min'],
'year': year,
'month': [f'{mm:02}' for mm in range(datetime.strptime(starttime + f' {year}', '%d %B %Y').month,
datetime.strptime(endtime + f' {year}', '%d %B %Y').month+1)],
'version': ["v1"],
'format': 'zip',
},
filename)
2025-08-22 10:43:11,283 INFO [2025-04-17T00:00:00] A new version has been added requiring a change to the download form. Users are advised that CDS API requests will need to be updated to accommodate this change. Please see the known issues under the Documentation tab for more information regarding the version update.
2025-08-22 10:43:11,283 INFO Request ID is e60cb44a-f9b0-4160-941a-fe94f14c9d6b
2025-08-22 10:43:11,307 INFO status has been updated to accepted
2025-08-22 10:43:24,870 INFO status has been updated to running
2025-08-22 10:43:32,553 INFO status has been updated to successful
'./data_dir/tide_2023.zip'
# Download data - Mean Sea Level
filename = os.path.join(DATADIR, 'MSL.zip')
c.retrieve(
'sis-water-level-change-timeseries-cmip6',
{
'variable': ['mean_sea_level'],
'experiment': ['historical', 'future'],
'temporal_aggregation': ['annual'],
'year': [str(x) for x in list(range(1985, year+1))],
'version': ["v1"],
'format': 'zip',
},
filename)
2025-08-22 11:46:45,514 INFO [2025-04-17T00:00:00] A new version has been added requiring a change to the download form. Users are advised that CDS API requests will need to be updated to accommodate this change. Please see the known issues under the Documentation tab for more information regarding the version update.
2025-08-22 11:46:45,514 INFO Request ID is 004ce407-2bd3-4caf-8491-f48d045a6da5
2025-08-22 11:46:45,643 INFO status has been updated to accepted
2025-08-22 11:47:18,132 INFO status has been updated to successful
'./data_dir/MSL.zip'
Unzip the data#
From the CDS, the GTSMip water level data are available as netCDF files compressed into zip archives. Before we can load any data, we therefore have to extract the files. We can use the zipfile Python package for this: for each zip file we first construct a ZipFile() object, then apply extractall() to extract its contents.
zip_paths = glob(f'{DATADIR}*.zip')
for j in zip_paths:
with zipfile.ZipFile(j, 'r') as zObject:
zObject.extractall(path=DATADIR)
os.remove(j)
Create a list of the retrieved data files#
The data was downloaded in separate files, each corresponding to one variable and one month of the year, or a full year in the case of MSL. To facilitate data analysis later in the tutorial, we create lists of the extracted netCDF files for the water level, surge and tide variables:
gtsm_wl_nc_rel = glob(f'{DATADIR}*waterlevel*hourly*{year}*.nc')
gtsm_surge_nc_rel = glob(f'{DATADIR}*surge*hourly*{year}*.nc')
gtsm_tide_nc_rel = glob(f'{DATADIR}*tide*{year}*.nc')
gtsm_wl_nc = [os.path.basename(i) for i in gtsm_wl_nc_rel]
gtsm_surge_nc = [os.path.basename(i) for i in gtsm_surge_nc_rel]
gtsm_tide_nc = [os.path.basename(i) for i in gtsm_tide_nc_rel]
We can inspect these lists by printing their elements - filenames of the extracted netCDF files:
print('Water level timeseries files:')
print(gtsm_wl_nc)
print('\nSurge timeseries files:')
print(gtsm_surge_nc)
print('\nTide timeseries files:')
print(gtsm_tide_nc)
Water level timeseries files:
['reanalysis_waterlevel_hourly_2023_01_v2.nc', 'reanalysis_waterlevel_hourly_2023_02_v2.nc', 'reanalysis_waterlevel_hourly_2023_03_v2.nc', 'reanalysis_waterlevel_hourly_2023_04_v2.nc', 'reanalysis_waterlevel_hourly_2023_05_v2.nc', 'reanalysis_waterlevel_hourly_2023_06_v2.nc', 'reanalysis_waterlevel_hourly_2023_07_v2.nc', 'reanalysis_waterlevel_hourly_2023_08_v2.nc', 'reanalysis_waterlevel_hourly_2023_09_v2.nc', 'reanalysis_waterlevel_hourly_2023_10_v2.nc', 'reanalysis_waterlevel_hourly_2023_11_v2.nc', 'reanalysis_waterlevel_hourly_2023_12_v2.nc']
Surge timeseries files:
['reanalysis_surge_hourly_2023_01_v2.nc', 'reanalysis_surge_hourly_2023_02_v2.nc', 'reanalysis_surge_hourly_2023_03_v2.nc', 'reanalysis_surge_hourly_2023_04_v2.nc', 'reanalysis_surge_hourly_2023_05_v2.nc', 'reanalysis_surge_hourly_2023_06_v2.nc', 'reanalysis_surge_hourly_2023_07_v2.nc', 'reanalysis_surge_hourly_2023_08_v2.nc', 'reanalysis_surge_hourly_2023_09_v2.nc', 'reanalysis_surge_hourly_2023_10_v2.nc', 'reanalysis_surge_hourly_2023_11_v2.nc', 'reanalysis_surge_hourly_2023_12_v2.nc']
Tide timeseries files:
['future_tide_2023_12_v1.nc']
Load and visualize GTSMip water level data points on a map#
Now that we have downloaded and extracted the data, we can inspect it and visualize the output points on a map. This helps us understand the spatial resolution of the available data.
Open multifile dataset of water level and surge data for the given year
Calculate statistical values to visualize (99th percentile)
Visualize data points on the global and European maps
In this section we apply these steps to both the surge and water level datasets.
Open multifile dataset of all water level data for the given year#
We begin by opening the downloaded water level timeseries netCDF files in our list. We can use the Python library xarray and its function open_mfdataset to read in multiple netCDF files.
The result is an xarray.Dataset object with two dimensions: time and stations. Each station corresponds to coordinates station_x_coordinate and station_y_coordinate. Looking at the contents of the dataset, we see that it contains 43119 stations. Please note that after this step the dataset is opened, but not yet loaded into memory.
# Open multi-file dataset
ds_wl = xr.open_mfdataset([os.path.join(DATADIR, x) for x in gtsm_wl_nc])
# Rearrange how the dataset is loaded into memory during computation to optimize processing
ds_wl = ds_wl.chunk({"time": -1, 'stations': 'auto'})
# Show the contents of the dataset
ds_wl
<xarray.Dataset> Size: 3GB
Dimensions: (time: 8760, stations: 43119)
Coordinates:
station_x_coordinate (stations) float64 345kB dask.array<chunksize=(43119,), meta=np.ndarray>
station_y_coordinate (stations) float64 345kB dask.array<chunksize=(43119,), meta=np.ndarray>
* time (time) datetime64[ns] 70kB 2023-01-01 ... 2023-12-3...
* stations (stations) uint16 86kB 0 1 2 3 ... 43731 43732 43733
Data variables:
waterlevel (time, stations) float64 3GB dask.array<chunksize=(8760, 1915), meta=np.ndarray>
Attributes: (12/35)
Conventions: CF-1.6
featureType: timeSeries
id: GTSMv3_totalwaterlevels
naming_authority: https://deltares.nl/en
Metadata_Conventions: Unidata Dataset Discovery v1.0
title: Hourly timeseries of total water levels
... ...
geospatial_vertical_max: 7.978
geospatial_vertical_units: m
geospatial_vertical_positive: up
time_coverage_start: 2023-01-01 00:00:00
time_coverage_end: 2023-01-31 23:00:00
experiment: reanalysis
We can find out more about the dataset from the Attributes of the dataset and those of the individual variables. Such information includes a short description of the dataset, units, coordinate system, etc.:
print('Title: ' + ds_wl.attrs['title'])
print('Summary: ' + ds_wl.attrs['summary'])
print('Coordinate system: ' + ds_wl.station_x_coordinate.attrs['crs'])
Title: Hourly timeseries of total water levels
Summary: This dataset has been produced with the Global Tide and Surge Model (GTSM) version 3.0. GTSM was forced with wind speed and pressure fields from ERA5 climate reanalysis.
Coordinate system: EPSG:4326
Now we can also open the surge level dataset in a similar way and name it ds_sur, and open the dataset of tidal levels and call it ds_tide:
# Load surge levels similar to water levels
ds_sur = xr.open_mfdataset([os.path.join(DATADIR, x) for x in gtsm_surge_nc])
ds_sur = ds_sur.chunk({"time": -1, 'stations': 'auto'})
ds_sur
<xarray.Dataset> Size: 3GB
Dimensions: (time: 8760, stations: 43119)
Coordinates:
station_x_coordinate (stations) float64 345kB dask.array<chunksize=(43119,), meta=np.ndarray>
station_y_coordinate (stations) float64 345kB dask.array<chunksize=(43119,), meta=np.ndarray>
* time (time) datetime64[ns] 70kB 2023-01-01 ... 2023-12-3...
* stations (stations) uint16 86kB 0 1 2 3 ... 43731 43732 43733
Data variables:
surge (time, stations) float64 3GB dask.array<chunksize=(8760, 1915), meta=np.ndarray>
Attributes: (12/35)
Conventions: CF-1.6
featureType: timeSeries
id: GTSMv3_surge
naming_authority: https://deltares.nl/en
Metadata_Conventions: Unidata Dataset Discovery v1.0
title: Hourly timeseries of surge levels
... ...
geospatial_vertical_max: 2.449
geospatial_vertical_units: m
geospatial_vertical_positive: up
time_coverage_start: 2023-01-01 00:00:00
time_coverage_end: 2023-01-31 23:00:00
experiment: reanalysis
# Load tidal elevation data
ds_tide = xr.open_mfdataset([os.path.join(DATADIR, x) for x in gtsm_tide_nc])
ds_tide = ds_tide.chunk({"time": -1, 'stations': 'auto'})
ds_tide
<xarray.Dataset> Size: 2GB
Dimensions: (time: 4464, stations: 43119)
Coordinates:
station_x_coordinate (stations) float64 345kB dask.array<chunksize=(43119,), meta=np.ndarray>
station_y_coordinate (stations) float64 345kB dask.array<chunksize=(43119,), meta=np.ndarray>
* time (time) datetime64[ns] 36kB 2023-12-01 ... 2023-12-3...
* stations (stations) uint16 86kB 0 1 2 3 ... 43731 43732 43733
Data variables:
tide (time, stations) float64 2GB dask.array<chunksize=(4464, 3758), meta=np.ndarray>
Attributes: (12/35)
Conventions: CF-1.6
featureType: timeSeries
id: GTSMv3_tides
naming_authority: https://deltares.nl/en
Metadata_Conventions: Unidata Dataset Discovery v1.0
title: 10-minute timeseries of tide levels
... ...
time_coverage_start: 2023-12-01 00:00:00
time_coverage_end: 2023-12-31 23:50:00
experiment:
date_modified: 2021-05-06 15:38:43.806307 UTC
contact: Please contact Copernicus User Support on ...
history: This is version 1 of the dataset
Calculating statistics and visualizing data on a map#
Calculating quantiles of water levels and surges#
Rather than plotting the data for individual timesteps, which gives us little information about the range of water level data, we can instead visualize the 99th quantile. This is the value that is exceeded only 1% of the time within the given dataset, and therefore gives us an impression of the upper range of the water level or surge data. We can calculate these values using the quantile function.
quan = 0.99
# Calculate quantile for water level timeseries
ds_wl_q = ds_wl.quantile(quan, dim='time')
ds_wl_q.load()
<xarray.Dataset> Size: 1MB
Dimensions: (stations: 43119)
Coordinates:
station_x_coordinate (stations) float64 345kB 24.09 24.09 ... 37.75 37.95
station_y_coordinate (stations) float64 345kB 34.83 34.87 ... -46.98 -46.63
* stations (stations) uint16 86kB 0 1 2 3 ... 43731 43732 43733
quantile float64 8B 0.99
Data variables:
waterlevel (stations) float64 345kB 0.226 0.226 ... 0.689 0.694
# Calculate quantile for surge timeseries
ds_sur_q = ds_sur.quantile(quan, dim='time')
ds_sur_q.load()
<xarray.Dataset> Size: 1MB
Dimensions: (stations: 43119)
Coordinates:
station_x_coordinate (stations) float64 345kB 24.09 24.09 ... 37.75 37.95
station_y_coordinate (stations) float64 345kB 34.83 34.87 ... -46.98 -46.63
* stations (stations) uint16 86kB 0 1 2 3 ... 43731 43732 43733
quantile float64 8B 0.99
Data variables:
surge (stations) float64 345kB 0.1234 0.124 ... 0.297 0.295
Visualize data points on the global and European maps#
We can visualize the calculated quantile values on a map. Below we do this on global and European maps. To make the latter plot, we can subset the data to Europe by using a bounding box of coordinates bbox and filter the dataset coordinates accordingly, resulting in variables ds_wl_eu and ds_sur_eu that contain the water level and surge data for Europe.
# Define bounding box for Europe
bbox = [-10.5, 36, 35.6, 72]
# load data arrays with coordinates to enable filtering on coordinates in the next step
ds_wl_q.station_x_coordinate.load()
ds_wl_q.station_y_coordinate.load()
<xarray.DataArray 'station_y_coordinate' (stations: 43119)> Size: 345kB
array([ 34.827, 34.871, 35.369, ..., 34.204, -46.978, -46.626])
Coordinates:
station_x_coordinate (stations) float64 345kB 24.09 24.09 ... 37.75 37.95
station_y_coordinate (stations) float64 345kB 34.83 34.87 ... -46.98 -46.63
* stations (stations) uint16 86kB 0 1 2 3 ... 43731 43732 43733
quantile float64 8B 0.99
Attributes:
units: degrees_north
short_name: latitude
long_name: latitude
crs: EPSG:4326
# crop dataset of quantiles to the bounding box
ds_wl_eu_q = ds_wl_q.where((ds_wl_q.station_x_coordinate > bbox[0]) &
(ds_wl_q.station_x_coordinate < bbox[1]) &
(ds_wl_q.station_y_coordinate > bbox[2]) &
(ds_wl_q.station_y_coordinate < bbox[3]), drop=True)
ds_sur_eu_q = ds_sur_q.where((ds_wl_q.station_x_coordinate > bbox[0]) &
(ds_wl_q.station_x_coordinate < bbox[1]) &
(ds_wl_q.station_y_coordinate > bbox[2]) &
(ds_wl_q.station_y_coordinate < bbox[3]), drop=True)
The code below generates a figure with global and European maps of the 99th quantile of water levels side by side. The resulting figure shows that the dataset contains a large number of points, densely concentrated along the coasts and in shallow seas, with a sparse grid covering the oceans. Europe has a high density of data points (approximately 1.25 km resolution along the European coast).
# select plot values range (colorbar limits)
lims = [0, 3]
# Initiate the plot with two subplots
fig = plt.figure(figsize=(15, 5))
ax0 = plt.subplot2grid((1, 2), (0, 0), projection=ccrs.Robinson())
ax1 = plt.subplot2grid((1, 2), (0, 1), projection=ccrs.PlateCarree())
# Plot the 2D maps with scatter - global map
im = ax0.scatter(x=ds_wl_q.station_x_coordinate, y=ds_wl_q.station_y_coordinate, s=15,
c=ds_wl_q['waterlevel'], transform=ccrs.PlateCarree(), cmap='viridis',
vmin=lims[0], vmax=lims[1])
ax0.coastlines(color='black')
# Plot the 2D maps with scatter - Europe map
im = ax1.scatter(x=ds_wl_eu_q.station_x_coordinate, y=ds_wl_eu_q.station_y_coordinate, s=15,
c=ds_wl_eu_q['waterlevel'], transform=ccrs.PlateCarree(), cmap='viridis',
vmin=lims[0], vmax=lims[1])
cbar = plt.colorbar(im)
cbar.set_label('Total still water level [mMSL]', fontsize=14)
ax1.coastlines(color='black')
# Add title
fig.suptitle(f"{int(quan*100)}th percentile of still water levels \n in the period" +
             f" {ds_wl.time.dt.strftime('%Y-%m-%d').values[0]} to " +
             f"{ds_wl.time.dt.strftime('%Y-%m-%d').values[-1]}", fontsize=16)
# Save the figure
fig.savefig(f'{OUTDIR}/map_waterlevel_perc{int(quan*100)}_{year}.png', dpi=300)
We can visualize the quantiles of the surge values in a similar way:
lims = [0, 1]
# Initiate the plot with two subplots
fig = plt.figure(figsize=(15, 5))
ax0 = plt.subplot2grid((1, 2), (0, 0), projection=ccrs.Robinson())
ax1 = plt.subplot2grid((1, 2), (0, 1), projection=ccrs.PlateCarree())
# Plot the 2D maps with scatter - global map
im = ax0.scatter(x=ds_sur_q.station_x_coordinate, y=ds_sur_q.station_y_coordinate, s=15,
c=ds_sur_q['surge'], transform=ccrs.PlateCarree(), cmap='viridis',
vmin=lims[0], vmax=lims[1])
ax0.coastlines(color='black')
# Plot the 2D maps with scatter - Europe map
im = ax1.scatter(x=ds_sur_eu_q.station_x_coordinate, y=ds_sur_eu_q.station_y_coordinate, s=15,
c=ds_sur_eu_q['surge'], transform=ccrs.PlateCarree(), cmap='viridis',
vmin=lims[0], vmax=lims[1])
cbar = plt.colorbar(im)
cbar.set_label('Surge height [m]', fontsize=14)
ax1.coastlines(color='black')
# Add title
fig.suptitle(f"{int(quan*100)}th percentile of surge heights \n in the period" +
             f" {ds_wl.time.dt.strftime('%Y-%m-%d').values[0]} to " +
             f"{ds_wl.time.dt.strftime('%Y-%m-%d').values[-1]}", fontsize=16)
# Save the figure
fig.savefig(f'{OUTDIR}/map_surge_perc{int(quan*100)}_{year}.png', dpi=300)
The visualizations above help us understand where the highest water levels and surge levels occur in Europe during winter storms. The highest overall water levels are found at the coasts of the UK and France near the English Channel, where tidal ranges are large. The highest surges, on the other hand, are found along the coasts of the North Sea, especially in the Netherlands, Denmark and Germany, where a combination of the shallow bathymetry of the North Sea, the shape of the coastline and the dominant wind direction during winter storms produces the highest surges.
Note: on the global map we do not see large surge peaks in the tropical zone, where tropical cyclones can also cause large storm surges. This highlights an important limitation of the dataset: modelling tropical cyclone water levels accurately often requires a model with higher spatial resolution, forced with meteorological data that accurately represents extreme winds during tropical cyclones. The forcing used here, the ERA5 reanalysis dataset, does not always represent tropical cyclones accurately, especially in the case of relatively small cyclone systems with steep pressure gradients.
Visualize timeseries of high water levels and surges for a specific location#
We can take a look at the data along the temporal dimension by visualizing timeseries for a single location. We do this for a specific location affected by a known large storm. A location was selected at the start of this notebook for each of the storm options.
First we need to determine which station in our dataset is closest to the location of interest. We do this by searching for the station with the smallest squared difference in coordinates from the location of interest (whose coordinates are stored in the latlon variable).
# First, find the index of the grid point nearest a specific lat/lon
lons = ds_wl.station_x_coordinate.values
lats = ds_wl.station_y_coordinate.values
dists = (lats - latlon[0])**2 + (lons - latlon[1])**2
closest_idx = np.argmin(dists)
# get station number in the dataset that corresponds to this location
station = ds_wl.stations.values[closest_idx]
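The search above minimises squared differences in degrees, which compresses east-west distances away from the equator. A great-circle (haversine) variant, shown here as a sketch with the same array inputs rather than as the notebook's method, would be:

```python
import numpy as np

def nearest_station_haversine(lats, lons, lat0, lon0):
    """Return the index of the station closest to (lat0, lon0) by great-circle distance."""
    R = 6371.0  # mean Earth radius [km]
    lat1, lon1 = np.radians(lats), np.radians(lons)
    lat2, lon2 = np.radians(lat0), np.radians(lon0)
    # Haversine formula, vectorised over all stations
    a = (np.sin((lat2 - lat1) / 2)**2
         + np.cos(lat1) * np.cos(lat2) * np.sin((lon2 - lon1) / 2)**2)
    dist_km = 2 * R * np.arcsin(np.sqrt(a))
    return int(np.argmin(dist_km))

# Example with three synthetic stations; the third lies near the Pia location
lats = np.array([0.0, 10.0, 52.0])
lons = np.array([0.0, 10.0, 4.0])
closest = nearest_station_haversine(lats, lons, 51.98, 4.075)
```

For GTSM's roughly 43,000 stations both searches are fast; near the equator they usually agree, but the haversine version is safer at high latitudes.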
Once the station number has been determined, we can subset the datasets of water levels and surges to that specific location, and to the selected time period based on variables starttime and endtime:
# convert timeframe variables to datetime
datetime_start = datetime.strptime(f'{starttime} {year}', "%d %B %Y")
datetime_end = datetime.strptime(f'{endtime} {year}', "%d %B %Y")
# subset data to the selected location and time period
ds_wl_loc = ds_wl.sel(stations=station, time=slice(datetime_start, datetime_end), drop=True)
ds_sur_loc = ds_sur.sel(stations=station, time=slice(datetime_start, datetime_end), drop=True)
ds_tide_loc = ds_tide.sel(stations=station, time=slice(datetime_start, datetime_end), drop=True)
Now we are ready to create the timeseries plot. We create a figure with three panels: a map showing the location of interest and the nearest GTSM output point, and timeseries plots of water levels and surge heights. The time axes of the latter two plots show the selected time window around the maximum water level event, with a red vertical line indicating the time of the maximum water level.
# plot timeseries of water levels and surges at the selected location
fig = plt.figure(figsize=(20, 7))
ax0 = plt.subplot2grid((2, 3), (0, 0), rowspan=2, projection=ccrs.PlateCarree())
ax1 = plt.subplot2grid((2, 3), (0, 1), colspan=2)
ax2 = plt.subplot2grid((2, 3), (1, 1), colspan=2)
# plot location on the map
ax0.set_ylim([latlon[0]-0.5, latlon[0]+0.5])
ax0.set_xlim([latlon[1]-0.5, latlon[1]+0.5])
ax0.add_feature(cfeature.LAND, edgecolor='black')
ax0.add_feature(cfeature.OCEAN, edgecolor='black')
gl = ax0.gridlines(draw_labels=True)
gl.top_labels = gl.right_labels = False
bs = ax0.scatter(x=latlon[1], y=latlon[0], s=60, transform=ccrs.PlateCarree(), facecolors='None',
edgecolors='black', linewidth=3, label='Location of interest')
bs = ax0.scatter(x=ds_wl.sel(stations=station).station_x_coordinate.values,
y=ds_wl.sel(stations=station).station_y_coordinate.values,
s=60, transform=ccrs.PlateCarree(), facecolors='None', edgecolors='red',
linewidth=3, label=f'GTSM output loc. {station}')
ax0.title.set_text("Location of the data points")
ax0.legend()
# plot timeseries of water levels and tides
ts = ax1.plot(ds_wl_loc.time, ds_wl_loc.waterlevel, 'b-', alpha=0.5, label='Still water level')
ax1.plot(ds_tide_loc.time, ds_tide_loc.tide,
color='grey', linestyle='--', alpha=0.5, label='Tidal elevation')
ax1.plot(ds_wl_loc.waterlevel.idxmax(), ds_wl_loc.waterlevel.max(), 'r*')
ax1.axvline(ds_wl_loc.waterlevel.idxmax().values, ls='--', color='r')
ax1.set_ylabel('Water level [mMSL]')
ax1.grid()
ax1.title.set_text('Timeseries of total water levels and tidal elevation')
ax1.set_xticklabels([])
ax1.legend()
# plot timeseries of surge levels
ts = ax2.plot(ds_sur_loc.time, ds_sur_loc.surge, 'b-', alpha=0.5)
ax2.axvline(ds_wl_loc.waterlevel.idxmax().values, ls='--', color='r')
ax2.set_ylabel('Surge levels [m]')
ax2.grid()
ax2.title.set_text('Timeseries of surge levels')
# Set time axis limits to the selected time range
datetime_start = datetime.strptime(f'{starttime} {year}', "%d %B %Y")
datetime_end = datetime.strptime(f'{endtime} {year}', "%d %B %Y")
plt.setp([ax1, ax2], xlim=[datetime_start, datetime_end])
plt.suptitle(f'Water levels during storm {storm}', fontsize=16)
# Save the figure
fig.savefig(f'{OUTDIR}/timeseries_{storm}_{year}_loc_lat_{latlon[0]}_lon_{latlon[1]}.png', dpi=300)
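The three plotted components are closely related: the surge is the non-tidal residual, so the total water level is approximately the sum of tide and surge (apart from nonlinear tide-surge interaction in the model). A minimal sketch with synthetic data standing in for the GTSM outputs (all arrays and values below are invented for illustration) shows how one could check this relation at a station:

```python
import numpy as np
import xarray as xr

# Synthetic hourly series standing in for the GTSM outputs at one station
time = np.arange('2023-12-20', '2023-12-23', dtype='datetime64[h]')
hours = np.arange(time.size)
tide = 1.5 * np.sin(2 * np.pi * hours / 12.42)     # semi-diurnal tide [m]
surge = 0.8 * np.exp(-((hours - 36) / 10.0) ** 2)  # storm surge pulse [m]
waterlevel = tide + surge                          # here exactly additive

ds = xr.Dataset(
    {'waterlevel': ('time', waterlevel),
     'tide': ('time', tide),
     'surge': ('time', surge)},
    coords={'time': time})

# Non-tidal residual; for real GTSM output this only approximates the surge
# variable because of nonlinear tide-surge interaction
residual = ds['waterlevel'] - ds['tide']
print(float(np.abs(residual - ds['surge']).max()))
```

In this synthetic case the residual matches the surge exactly; in the real dataset the two differ slightly, which is why GTSM provides surge as a separate variable.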
To understand how water levels vary spatially around the peak water level event, we can plot maps of the GTSM model output points for several hours before and after the peak at the location of interest. The code below first subsets the data for plotting, and then creates a figure of 9 panels showing water levels for 9 consecutive hours.
# Selecting data for plotting
# Select area around the location of interest
bbox_arealoc = [latlon[1]-1.5, latlon[1]+1.5, latlon[0]-1.5, latlon[0]+1.5]
ds_wl.station_x_coordinate.load()
ds_wl.station_y_coordinate.load()
ds_wl_arealoc = ds_wl.where((ds_wl.station_x_coordinate > bbox_arealoc[0]) &
(ds_wl.station_x_coordinate < bbox_arealoc[1]) &
(ds_wl.station_y_coordinate > bbox_arealoc[2]) &
(ds_wl.station_y_coordinate < bbox_arealoc[3]), drop=True)
# Select the 9 hourly timesteps around the peak water level at location of interest
ds_wl_arealoc = ds_wl_arealoc.sel(time=slice(ds_wl_loc.waterlevel.idxmax().values - np.timedelta64(4, 'h'),
ds_wl_loc.waterlevel.idxmax().values + np.timedelta64(4, 'h')), drop=True)
ds_wl_arealoc.load() # load data into memory for faster plotting
<xarray.Dataset> Size: 15kB
Dimensions: (time: 9, stations: 163)
Coordinates:
station_x_coordinate (stations) float64 1kB 2.629 2.703 ... 3.274 3.113
station_y_coordinate (stations) float64 1kB 51.13 51.16 ... 51.75 51.38
* time (time) datetime64[ns] 72B 2023-12-21T16:00:00 ... 2...
* stations (stations) uint16 326B 3894 3895 3896 ... 41018 41019
Data variables:
waterlevel (time, stations) float64 12kB 1.162 1.054 ... 0.19
Attributes: (12/35)
Conventions: CF-1.6
featureType: timeSeries
id: GTSMv3_totalwaterlevels
naming_authority: https://deltares.nl/en
Metadata_Conventions: Unidata Dataset Discovery v1.0
title: Hourly timeseries of total water levels
... ...
geospatial_vertical_max: 7.978
geospatial_vertical_units: m
geospatial_vertical_positive: up
time_coverage_start: 2023-01-01 00:00:00
time_coverage_end: 2023-01-31 23:00:00
experiment:                   reanalysis
ds_wl_arealoc.waterlevel.max().values.tolist()
3.668
# Plotting the data
# Colorbar limits for total water levels based on the maximum value at the location of interest
lims = [0, ds_wl_arealoc.waterlevel.max().values.tolist()]
# Initiate the plot with nine subplots (3x3 grid)
fig, axs = plt.subplots(nrows=3, ncols=3,
subplot_kw={'projection': ccrs.PlateCarree()},
figsize=(14, 8))
axs = axs.flatten()
# Loop over timesteps and plot each in a separate axis
for ii, timestamp in enumerate(ds_wl_arealoc.time.values):
im = axs[ii].scatter(x=ds_wl_arealoc.station_x_coordinate, y=ds_wl_arealoc.station_y_coordinate,
s=15, c=ds_wl_arealoc['waterlevel'].sel(time=timestamp),
cmap='coolwarm', vmin=lims[0], vmax=lims[1])
# Add land feature
axs[ii].add_feature(cfeature.LAND, edgecolor='black')
# add gridlines
gl = axs[ii].gridlines(draw_labels=False, color='darkgrey', alpha=0.2)
gl.left_labels = (ii % 3 == 0)
gl.bottom_labels = (ii // 3 == 2)
# add timestamp as title
axs[ii].set_title(np.datetime_as_string(timestamp, unit='m').replace('T', ' '))
fig.colorbar(im, ax=axs.ravel().tolist(), label='Water level [mMSL]')
# Add title
fig.suptitle("Spatial water level map around the time of the peak\n" +
f"water level at the location of interest during storm {storm}", fontsize=16)
# Save the figure
fig.savefig(f'{OUTDIR}/map_{storm}_{year}_loc_lat_{latlon[0]}_lon_{latlon[1]}.png', dpi=300)
From the plot above we can see how tidal propagation and surge generation during a storm generate high water levels at different locations along the coast.
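One way to quantify this propagation is to compare the time of the peak water level at each output point. The sketch below uses synthetic data standing in for `ds_wl_arealoc` (station IDs, peak times and amplitudes are invented for illustration); the same `idxmax` pattern can be applied to the real subset:

```python
import numpy as np
import xarray as xr

# Two synthetic stations where the peak arrives one hour apart
time = np.arange('2023-12-21T12', '2023-12-22T00', dtype='datetime64[h]')
hours = np.arange(time.size)
wl = np.stack([np.exp(-((hours - 4) / 2.0) ** 2),   # peak at 16:00
               np.exp(-((hours - 5) / 2.0) ** 2)])  # peak at 17:00
ds = xr.Dataset({'waterlevel': (('stations', 'time'), wl)},
                coords={'time': time, 'stations': [3894, 3895]})

# Time of the maximum water level at each station
t_peak = ds['waterlevel'].idxmax(dim='time')
# Lag relative to the first station, in hours
lag = (t_peak - t_peak.isel(stations=0)) / np.timedelta64(1, 'h')
print(lag.values)  # -> [0. 1.]
```

Mapping such lags for the stations in `ds_wl_arealoc` would show the direction in which the high water travels along the coast.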
Understanding vertical reference level of the data#
We have explored the water level statistics for the given year and looked at timeseries of water levels during the selected winter storm. But what is the vertical reference of this data, and how can it be related to historical measurements?
The GTSMip water levels and tides are referenced to the mean sea level (MSL) at each location. However, with ongoing sea level rise, MSL is a dynamic quantity that varies from year to year. The GTSMip dataset includes the effect of sea level rise by adding a sea level rise field in annual increments: each simulated year uses the estimated sea level rise field for that year. This sea level is based on a probabilistic model combining historical observations (up to 2015) with CMIP5 climate model projections (RCP8.5 scenario) from 2016 onwards. For more information and references, please refer to the Product User Guide.
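The mechanics of those annual increments can be illustrated with a toy example: each simulated year gets a single sea level rise offset added to its series. The offsets and helper function below are invented for illustration and are not the values or code used in GTSMip:

```python
import numpy as np

# Hypothetical annual MSL offsets relative to the baseline [m]
slr_by_year = {2021: 0.08, 2022: 0.09, 2023: 0.10}

def add_annual_slr(years, waterlevel_without_slr):
    """Add the per-year sea level rise offset to a water level series."""
    offsets = np.array([slr_by_year[y] for y in years])
    return waterlevel_without_slr + offsets

# Each sample is shifted by the offset of the year it belongs to
years = np.array([2022, 2022, 2023, 2023])
wl = np.array([0.5, 1.2, 0.4, 1.1])
print(add_annual_slr(years, wl))
```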
To set a clear vertical reference that allows comparing water levels between years, the reference MSL is defined as the mean sea level over the period 1986-2005. Water levels and tides at each model output location are therefore given relative to the mean MSL over 1986-2005 at that location.
To better understand this reference and check the contribution of the sea level rise to the total water levels, we can plot the MSL values for our location of interest for the period between 1985 and the year we are interested in.
gtsm_msl_nc = glob(f'{DATADIR}*msl*.nc')
ds_msl = xr.open_mfdataset(gtsm_msl_nc)
ds_msl_loc = ds_msl.sel(stations=station, drop=True)
fig, ax = plt.subplots(nrows=1, ncols=1, figsize=(10, 5))
ds_msl_loc['mean_sea_level'].plot(ax=ax)
ax.grid()
ylims = ax.get_ylim()
ax.fill_betweenx(ylims, [np.datetime64('1986')]*2, [np.datetime64('2005')]*2,
color='lightgrey', alpha=0.5)
ax.set_ylim(ylims)
ax.set_title(
'Mean Sea Level at the location of interest \n' +
f' Lat={latlon[0]}\N{DEGREE SIGN}N, Lon={latlon[1]}\N{DEGREE SIGN}E')
Text(0.5, 1.0, 'Mean Sea Level at the location of interest \n Lat=51.98°N, Lon=4.075°E')
We can check that the mean MSL over the period 1986-2005 is indeed close to zero by calculating it:
mean_1986_2005 = ds_msl_loc['mean_sea_level'].sel(time=slice('1986', '2005')).mean().values
print("Mean MSL value in the period 1986-2005 is equal to: %.2f m" % mean_1986_2005)
Mean MSL value in the period 1986-2005 is equal to: -0.00 m
We can also check what value of sea level rise is included in the runs for the year of the storm we are interested in, relative to the 1986-2005 baseline. While for recent years sea level rise is still relatively small, its importance will increase in the coming decades, potentially leading to higher water levels during storms.
MSL_year = ds_msl_loc['mean_sea_level'].sel(time=str(year)).values.squeeze()
print("MSL in %i at the location of interest: %.2f m" % (year, MSL_year))
MSL in 2023 at the location of interest: 0.10 m
Take home messages#
In this notebook we explored the GTSMip dataset and used it to better understand the dynamics of water levels during severe winter storms in Europe. By modifying and expanding the examples provided here, you can perform your own analysis for other locations and storms.
This project is licensed under APACHE License 2.0. | View on GitHub
