
Tutorial: Data Access and Visualisation of Lake Water Level (LWL) timeseries#
This notebook can be run on free online platforms, such as Binder, Kaggle and Colab, or it can be accessed from GitHub. The links to run this notebook in these environments are provided here, but please note that they are not supported by ECMWF.
Introduction#
The Copernicus Climate Change Service (C3S) is a European programme whose objective is to provide information on the current and past state of the climate, as well as future predictions and projections. All this information is freely available through the Climate Data Store (CDS) (https://cds.climate.copernicus.eu).
One of the datasets available from the CDS provides lake water level information derived from satellite radar altimetry for selected lakes all around the world. This technique, originally conceived to study open-ocean processes, has also acquired numerous useful measurements over lakes. The Lake Water Level is defined as the height of the reflecting surface, in metres above the geoid (the shape that the surface would take under the influence of Earth's gravity and rotation alone). For each of the key lakes monitored within the C3S Lakes production system, the Level-3 Water Level is provided in a separate file along with the corresponding uncertainty.
One of the characteristics of the lake water level product is its irregular time step. A water level is estimated each time the lake is observed by a mission/track and, depending on its size and location, a lake may be monitored by several missions/tracks with different repeat cycles. An overview of the product characteristics is provided in the Product User Guide and Specification (PUGS).
Learning Objectives 🎯#
The objective of this tutorial is to explain how to access the water level data for a specified lake and how to visualise its timeseries. The tutorial consists of four main steps: first, download and decompress the data; then visualise the location of the lake on a map; next, visualise and save the water level timeseries; and finally compute and save the yearly water level anomalies.
Prepare your environment#
The first thing we need to do is make sure you are ready to run the code while working through this notebook. There are a few simple steps to get yourself set up.
Set up CDSAPI and your credentials#
The code below will ensure that the cdsapi package is installed. If you have not set up your ~/.cdsapirc file with your credentials, you can replace None with the credentials that can be found on the CDS "how to API" page (you will need to log in to see your credentials).
!pip install -q cdsapi
# If you have already setup your .cdsapirc file you can leave this as None
cdsapi_key = None
cdsapi_url = None
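As a quick sanity check (an optional, minimal sketch), you can verify that either the variables above are set or a ~/.cdsapirc file exists before attempting any download:
# Optional sanity check: warn early if no CDS API credentials are configured.
# This only looks for the default ~/.cdsapirc location; adapt it if you keep the file elsewhere.
import os
if cdsapi_key is None and not os.path.isfile(os.path.expanduser('~/.cdsapirc')):
    print('No CDS API credentials found: set cdsapi_key/cdsapi_url above or create a ~/.cdsapirc file')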
(Install and) Import libraries#
The lake water level data is downloaded as a zip file containing a NetCDF file. Some libraries are needed to download data from the CDS, as well as to manage zip files (the zipfile library) and NetCDF files (the xarray library). Other libraries are used to plot and visualise the data (the matplotlib and cartopy libraries).
Additional libraries: os and glob are used for file management
import cdsapi
import os
import glob
import zipfile
import xarray as xr
import numpy as np
import matplotlib.pyplot as plt
from cartopy import crs, feature
Specify data directory#
# Directory to store data
# Please ensure that DATADIR points to a location where you have write permissions
DATADIR = './data_dir/'
# Create this directory if it doesn't exist
os.makedirs(DATADIR, exist_ok=True)
Explore data#
The current version of the Lake Water Level dataset in CDS (v4.0) contains timeseries from 311 lakes.
available_lakes = [
'achit', 'alakol', 'albert',
'amadjuak', 'american_falls', 'aqqikol_hu',
'argentino', 'athabasca', 'ayakkum',
'aydarkul', 'aylmer', 'azhibeksorkoli',
'bagre', 'baikal', 'bairab',
'baker', 'balbina', 'balkhash',
'bangweulu', 'bankim', 'baunt',
'beas', 'beysehir', 'bienville',
'big_trout', 'birch', 'biylikol',
'bluenose', 'bodensee', 'bogoria',
'bosten', 'bratskoye', 'bugunskoye',
'cahora_bassa', 'caribou', 'caspian',
'cayuga', 'cedar', 'cerros_colorados',
'chagbo_co', 'chapala', 'chardarya',
'chatyrkol', 'chishi', 'chlya',
'chocon', 'chukochye', 'cienagachilloa',
'claire', 'cochrane', 'corangamite',
'cuodarima', 'dagze_co', 'dalai',
'danau_towuti', 'danausingkarak', 'dangqiong',
'des_bois', 'dogaicoring_q', 'dorgon',
'dorsoidong_co', 'dubawnt', 'edouard',
'egridir', 'erie', 'faber',
'fitri', 'fontana', 'fort_peck',
'garkung', 'george', 'gods',
'grande_trois', 'greatslave', 'guri',
'gyaring_co', 'gyeze_caka', 'habbaniyah',
'hala', 'hamrin', 'har',
'hawizeh_marshes', 'heishi_beihu', 'hendrik_verwoerd',
'hinojo', 'hoh_xil_hu', 'hongze',
'hottah', 'hovsgol', 'hulun',
'huron', 'hyargas', 'iliamna',
'illmen', 'inarinjarvi', 'issykkul',
'iznik', 'jayakwadi', 'kabele',
'kabwe', 'kainji', 'kairakum',
'kamilukuak', 'kamyshlybas', 'kapchagayskoye',
'kara_bogaz_gol', 'karasor', 'kariba',
'kasba', 'khanka', 'kinkony',
'kisale', 'kivu', 'kokonor',
'kossou', 'krasnoyarskoye', 'kremenchutska',
'kubenskoye', 'kulundinskoye', 'kumskoye',
'kuybyshevskoye', 'kyoga', 'ladoga',
'lagdo', 'lagoa_dos_patos', 'langa_co',
'langano', 'lano', 'leman',
'lixiodain_co', 'lumajangdong_co', 'luotuo',
'mai_ndombe', 'malawi', 'mangbeto',
'manitoba', 'memar', 'michigan',
'migriggyangzham', 'mingacevir', 'mono',
'mossoul', 'mullet', 'mweru',
'naivasha', 'namco', 'namngum',
'nasser', 'nezahualcoyoti', 'ngangze',
'ngoring_co', 'nicaragua', 'nipissing',
'novosibirskoye', 'nueltin', 'oahe',
'old_wives', 'onega', 'ontario',
'opinac', 'peipus', 'prespa',
'pukaki', 'pyaozero', 'ranco',
'roseires', 'rukwa', 'rybinskoye',
'saint_jean', 'sakakawea', 'saksak',
'san_martin', 'saratovskoye', 'sarykamish',
'sasykkol', 'saysan', 'segozerskoye',
'serbug', 'sevan', 'shiroro',
'sobradino', 'soungari', 'srisailam',
'superior', 'swan', 'tana',
'tanganika', 'tangra_yumco', 'tchad',
'tchany', 'telashi', 'teletskoye',
'telmen', 'tengiz', 'tharthar',
'titicaca', 'todos_los_santos', 'toktogul',
'tonle_sap', 'tres_marias', 'tsimlyanskoye',
'tumba', 'turkana', 'ulan_ul',
'ulungur', 'umbozero', 'uvs',
'valencia', 'van', 'vanajanselka',
'vanerm', 'vattern', 'victoria',
'viedma', 'volta', 'walker',
'williston', 'winnipeg', 'winnipegosis',
'xiangyang', 'yamzho_yumco', 'yellowstone',
'zeyskoye', 'zhari_namco', 'zhelin',
'ziling', 'zimbambo', 'ziway',
'zonag',
]
Search for data#
Having selected the correct dataset, we now need to specify what product type, variables, temporal and geographic coverage we are interested in. These parameters can all be selected in the “Download data” tab. In this tab, a form appears in which we can select the following parameters to download:
Parameters of data to download:
- Region: Southeastern Asia
- Lake: Zhari-Namco
The dataset comes in a zip file, and may include several files. Only the most recent is necessary - this contains all of the data in previous files, plus the latest update.
At the end of the download form, select “Show API request”. This will reveal a block of code, which you can simply copy and paste into a cell of your Jupyter Notebook (see cell below). Having copied the API request into the cell below, running this will retrieve and download the data you requested into your local directory.
Warning
Please remember to accept the terms and conditions of the dataset, at the bottom of the CDS download form!
Download the data#
To download and visualise the timeseries, the name of the lake (lake_name) is required. The name of the lake must exist in the list of available lakes.
# the name of the lake
lake_name = 'Zhari-Namco'
A test is performed to verify that the water level of the selected lake is available.
if lake_name.lower().replace('-', '_') not in available_lakes:
    print(f'The lake "{lake_name}" is not available in this dataset')
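If the test above fails because of a typo, a small helper such as the following sketch (using Python's standard difflib module; the choice of three suggestions is arbitrary) can propose close matches from the list of available lakes:
# Suggest up to three close matches for a misspelled lake name (illustrative only)
import difflib
requested = lake_name.lower().replace('-', '_')
if requested not in available_lakes:
    suggestions = difflib.get_close_matches(requested, available_lakes, n=3)
    print(f'Close matches for "{lake_name}": {suggestions}')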
Using the CDS API, the code below retrieves the data for the specified lake into the defined output directory. The downloaded file is a compressed (zip) archive containing the data file in NetCDF format. To read the lake water level data, the archive is extracted into the output directory.
# create the CDS API client (uses the credentials defined above, or your ~/.cdsapirc file)
c = cdsapi.Client(url=cdsapi_url, key=cdsapi_key)

list_regions = ['northern_africa', 'northern_north_america', 'southeastern_asia',
                'southern_africa', 'southern_america', 'southern_north_america',
                'southwestern_asia']

c.retrieve(
    'satellite-lake-water-level',
    {
        'variable': 'all',
        'region': list_regions,
        'lake': lake_name.lower().replace('-', '_'),
        'version': 'version_4_0',
        'format': 'zip',
    },
    f'{DATADIR}/{lake_name}.zip')

# extract the NetCDF file from the downloaded zip archive
with zipfile.ZipFile(f'{DATADIR}/{lake_name}.zip') as z:
    z.extractall(f'{DATADIR}')
# recover the lake filename in netcdf format
nc_file = glob.glob(f'{DATADIR}/*{lake_name.upper().replace("_", "-")}*.nc')[0]
# open the file with Xarray
xr_lake = xr.open_dataset(nc_file)
2025-09-17 12:23:23,536 INFO Request ID is 02fa9da8-430e-4844-96ea-c566b5bf8140
2025-09-17 12:23:25,685 INFO status has been updated to accepted
2025-09-17 12:23:39,465 INFO status has been updated to running
2025-09-17 12:23:47,177 INFO status has been updated to successful
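Before going further, it can be helpful to take a quick look at what the file contains. The sketch below simply prints the data variables, the number of observations and the lake barycentre attributes used later for the map (names as found in the file downloaded above):
# Quick overview of the dataset: variables, record length and lake barycentre attributes
print('Variables:', list(xr_lake.data_vars))
print('Number of observations:', xr_lake.time.size)
print('Barycentre:', xr_lake.lake_barycentre_latitude, xr_lake.lake_barycentre_longitude)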
Visualise lake location#
It is always useful to check the location of the lake on a map. Many of the world’s lakes have the same name but may be located in different countries, as in the case of Lake Victoria: the best known is in Uganda, but there is also a Lake Victoria in Canada (48.35N, 57.11W) and another in Australia (34.04S, 141.28E). Or the name of the lake may be different, depending on the language: Caribou Lake or Reindeer Lake refer to the same lake (Canada). In some cases, even with the same language, the name of the lake may differ from one country to another: Lake Buenos Aires in Argentina and Lake General Carrera in Chile.
A world map showing the location of the lake is displayed. To generate this figure, we use the cartopy library.
fig = plt.figure(figsize=(12, 6))
ax = plt.subplot(1, 1, 1, projection=crs.PlateCarree())
ax.coastlines()
ax.add_feature(feature.BORDERS, linewidth=0.5, edgecolor='blue')
ax.stock_img()

# add lake location
lat_str = xr_lake.lake_barycentre_latitude.strip()
lat = float(lat_str[:-1])  # remove last character (N/S)
if lat_str.endswith('S'):
    lat = -lat
lon_str = xr_lake.lake_barycentre_longitude.strip()
lon = float(lon_str[:-1])  # remove last character (E/W)
if lon_str.endswith('W'):
    lon = -lon
print(f"Lake location (lat, lon): {lat}°, {lon}°")

plt.scatter(x=[lon], y=[lat], s=30, color='red', transform=crs.PlateCarree())
Lake location (lat, lon): 30.9°, 85.6°
Visualise lake water level time series#
Once the data has been extracted, it can be read and visualised. Data in NetCDF format is read using the Xarray library, and Matplotlib is used to display time series.
The figure with the timeseries will be saved in the output directory in PNG format, as the [lake_name]_timeseries.png file.
Changes in the observation time step, caused by changes in which missions and tracks cover the lake, can be seen in the figure. When a lake is monitored by several missions and tracks, the temporal resolution increases (the time between observations is reduced) and the time series becomes noisier due to inter-mission and inter-track biases.
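Before plotting, this change in sampling can be quantified by looking at the typical time between consecutive observations in each year. The following is a small illustrative check (it uses pandas, which is installed as a dependency of xarray):
# Median number of days between consecutive observations, per year
import pandas as pd
times = pd.to_datetime(xr_lake.time.values)
gaps_days = pd.Series(np.diff(times) / np.timedelta64(1, 'D'), index=times[1:])
print(gaps_days.groupby(gaps_days.index.year).median().round(1))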
# plot and visualise the timeseries
fig = plt.figure(figsize = (12, 6))
xr_lake.water_surface_height_above_reference_datum.plot()
plt.title(f'Lake: {lake_name.title()}')
# save the figure
png_file = f'{DATADIR}/{lake_name}_timeseries.png'
plt.savefig(png_file, dpi=300, bbox_inches="tight")
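As mentioned in the introduction, each water level estimate comes with an uncertainty. The sketch below adds it to the plot as a shaded band; the variable name water_surface_height_uncertainty is an assumption, so check it against the contents of your file (for example with print(xr_lake)) if needed.
# Sketch: water level with its uncertainty shown as a shaded band
# (the variable name 'water_surface_height_uncertainty' is assumed; verify it in your file)
if 'water_surface_height_uncertainty' in xr_lake:
    lwl = xr_lake.water_surface_height_above_reference_datum
    unc = xr_lake.water_surface_height_uncertainty
    fig = plt.figure(figsize=(12, 6))
    plt.plot(xr_lake.time, lwl, linewidth=0.8)
    plt.fill_between(xr_lake.time, lwl - unc, lwl + unc, alpha=0.3)
    plt.title(f'Lake: {lake_name.title()} (water level ± uncertainty)')
    plt.ylabel('Water level above geoid (m)')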
Visualise lake water level yearly anomalies#
Another way of analysing variations in water level is to visualise the anomalies with respect to the average value over a reference period, in the form of a bar chart, using red and blue colours to easily identify the years when levels are higher or lower than the climatic normal.
The same code can be used to visualise yearly anomalies relative to a particular year (see the example after the next cell).
The reference period can be adapted for each lake. The temporal coverage is not the same for all the lakes in the CDS, with the first available date ranging from 1992 for the longest records, as is the case for Lake Baikal, to 2019 for lakes monitored exclusively by the Sentinel-3B mission. In this example, the reference period is set to 1992-2002.
first_reference_year = 1992
last_reference_year = 2002
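As noted above, to compute anomalies relative to a single year rather than a period, set both bounds to that year, for example (the year below is purely illustrative):
# Example: anomalies relative to a single reference year (illustrative choice of year)
# first_reference_year = last_reference_year = 2000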
The yearly anomaly is calculated using the following steps:
- applying xarray's groupby() function to group the data by year, and
- calculating the mean value for each annual group.
The average value of the water level in the selected lake is estimated as the average water level over the reference period.
yearly_mean = xr_lake.groupby('time.year').mean(keep_attrs = True)
ref = yearly_mean.where((yearly_mean.year >= first_reference_year ) & (yearly_mean.year <= last_reference_year ))
ref_lwl_mean = np.nanmean(ref.water_surface_height_above_reference_datum.values)
print (f'Mean LWL value during the reference period: {np.round(ref_lwl_mean,3)} (m)')
Mean LWL value during the reference period: 4610.444 (m)
The yearly anomaly is shown in red for annual values above the average water level during the reference period, and in blue for annual values below it.
The figure with the anomalies is saved in the output directory in PNG format, as the [lake_name]_yearly-anomalies.png file.
year_mean = yearly_mean.to_dataframe()
year_mean['lwl_yearly_anomaly'] = year_mean['water_surface_height_above_reference_datum'] - ref_lwl_mean
year_mean['positive_anomaly'] = year_mean['lwl_yearly_anomaly'] > 0
fig, ax = plt.subplots(figsize=(12, 6))
year_mean['lwl_yearly_anomaly'].plot(kind='bar',
                                     color=year_mean.positive_anomaly.map({True: 'red', False: 'blue'}),
                                     xlabel='\nYear',
                                     ylabel='Lake Water Level Anomaly (m)',
                                     title=f'Yearly water level anomalies on Lake {lake_name.title().replace("_", "-")}\n reference period: {first_reference_year} - {last_reference_year}')
ax.grid()
# save the figure
png_file = f'{DATADIR}/{lake_name}_yearly-anomalies.png'
plt.savefig(png_file, dpi=300, bbox_inches='tight')
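The yearly mean levels and anomalies can also be saved as a table for further analysis, for example in CSV format. This is a minimal sketch; the file name is an illustrative choice:
# Save the yearly mean water levels and anomalies to a CSV file in the output directory
csv_file = f'{DATADIR}/{lake_name}_yearly-anomalies.csv'
year_mean[['water_surface_height_above_reference_datum', 'lwl_yearly_anomaly']].to_csv(csv_file)
print(f'Yearly anomalies saved to {csv_file}')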