CAMS Global Atmospheric Composition Forecast Practical#

Run the tutorial via free cloud platforms: binder kaggle colab

Learning objectives#

  • In this practical exercise you will learn how to download CAMS Global Atmospheric Composition Forecast data programmatically using the Application Programming Interface (API) of the Atmosphere Data Store (ADS).

  • You will also learn how to read the data into a Python object and explore its characteristics, including data dimensions, units, etc.

  • You will then visualise static and animated maps of forecast data for CAMS Total Aerosol Optical Depth at 550nm showing transport of Saharan dust and Canadian wildfire smoke across the North Atlantic Ocean between 25 May and 15 June 2025.

  • Finally, you will plot a time series of the same data, but for a specific grid cell over Payerne, Switzerland.

This practical session is based on real events that are reported in a news article on the Boreal summer 2025 and in an article on Canadian wildfires in early June 2025 and smoke transport to Europe.

Initial setup#

Before we begin we must prepare our environment. This includes installing the Application Programming Interface (API) of the Atmosphere Data Store (ADS), intalling any other packages not already installed, setting up our ADS API credentials and importing the various Python libraries that we will need.

# Ensure that the cdsapi package is installed
!pip install -q cdsapi
# If you are running this notebook in Colab, uncomment the line below and run this cell.
#!pip install cartopy

Add your ADS API credentials#

To set up your ADS API credentials, please login/register on the ADS, then you will see your unique API key here.

You can add this API key to your current session by replacing ######### in the code cell below with your API key.

import os
os.environ['CDSAPI_URL'] = 'https://ads.atmosphere.copernicus.eu/api'
os.environ['CDSAPI_KEY'] = '#############################################'

Import libraries#

# CDS API
import cdsapi

# Libraries for working with multidimensional arrays
import numpy as np
import xarray as xr
import pandas as pd
import datetime as dt

# Libraries to assist with animation and visualisations
%matplotlib inline
import matplotlib.pyplot as plt
from matplotlib import animation
import cartopy.crs as ccrs
from IPython.display import HTML

# Disable warnings for data download via API
import urllib3 
urllib3.disable_warnings()

Here we specify a data directory in which we will download our data and all output files that we will generate:

DATADIR = '.'

Explore and download data#

Visit the download form for the CAMS global forecast data https://ads.atmosphere.copernicus.eu/datasets/cams-global-atmospheric-composition-forecasts?tab=download. View the parameters in the API script in the following cell and select the corresponding options.

At the end of the download form, select “Show API request”. This will reveal a block of code, which should be identical to the code cell below.

Please remember to accept the terms and conditions at the bottom of the download form.

Download data#

With the API request copied into the cell below, running this cell will retrieve and download the data you requested into your local directory.

dataset = "cams-global-atmospheric-composition-forecasts"
request = {
    'variable': ['total_aerosol_optical_depth_550nm',
                 'dust_aerosol_optical_depth_550nm',
                 'organic_matter_aerosol_optical_depth_550nm'],
    'date': ['2025-05-25/2025-06-15'],
    'time': ['00:00'],
    'leadtime_hour': ["3", "6", "9", "12", "15", "18", "21", "24"],
    'type': ['forecast'],
    'data_format': 'netcdf'
}

client = cdsapi.Client()
client.retrieve(dataset, request).download(f'{DATADIR}/total-aod-550nm-global-forecast-jun2025.nc')

Inspect data#

# Path to the downloaded file
netcdf_file = f'{DATADIR}/total-aod-550nm-global-forecast-jun2025.nc'

# Create Xarray Dataset
ds = xr.open_dataset(netcdf_file)

# view the dataset
ds
# Define single time dimension
stacked_ds = ds.stack(datetime=("forecast_reference_time", "forecast_period"))
stacked_ds = (
    stacked_ds.drop_vars("datetime")
    .rename_dims({"datetime": "time"})
    .rename_vars({"valid_time": "time"})
)

ds = stacked_ds.set_index(time="time")
ds
# create xarray data array object (single variable)
da = ds['aod550']
da

Plot global map of forecast#

Define time step#

time_step = 0
lead_times = da['time']
forecast_day = str(lead_times[time_step].to_numpy())[:10]
forecast_day

Plot map for given time step#

# create the figure panel and specify size
fig = plt.figure(figsize=(15, 10))

# create the map using the cartopy PlateCarree projection
ax = plt.subplot(1,1,1, projection=ccrs.PlateCarree())

# Add lat/lon grid
ax.gridlines(draw_labels=True, linewidth=1, color='gray', alpha=0.5, linestyle='--')

# Set figure title
ax.set_title(f'AOD at 550nm, {forecast_day}', fontsize=12) 

# Plot the data
im = plt.pcolormesh(da.longitude, da.latitude, da[:,:,time_step], cmap='YlOrRd', vmin=0, vmax=1) 

# Add coastlines
ax.coastlines(color='black') 

# Specify the colourbar, including fraction of original axes to use for colorbar, 
# and fraction of original axes between colorbar and new image axes
cbar = plt.colorbar(im, fraction=0.025, pad=0.05) 

# Define the colourbar label
cbar.set_label('AOD at 550nm') 

# Save the figure
fig.savefig(f'{DATADIR}/aod-550nm-global.png')

Plot animation of all forecast steps#

# For convenience, resample as daily mean data
da_mean = ds['aod550'].resample(time='D').mean()

Create initial state#

fig = plt.figure(figsize=(10, 5)) # Define the figure and specify size
ax = plt.subplot(1,1,1, projection=ccrs.PlateCarree()) # Specify plot area & projection
ax.gridlines(draw_labels=True, linewidth=1, color='gray', alpha=0.5, linestyle='--') # Add lat/lon grid
ax.set_title(f'AOD at 550nm, {str(da_mean.time[0].values)[:-16]}', fontsize=12) # Set figure title
ax.coastlines(color='black') # Add coastlines
im = plt.pcolormesh(da_mean.longitude, da_mean.latitude, da_mean[:,:,0], cmap='YlOrRd', vmin=0, vmax=1) # Plot the data
cbar = plt.colorbar(im,fraction=0.046, pad=0.04) # Specify the colourbar
cbar.set_label('AOD at 550nm') # Define the colourbar label

Set number of frames#

frames = 23

Create a function that will be called by the animation object#

def animate(i):
    array = da_mean[:,:,i].values
    im.set_array(array.flatten())
    ax.set_title(f'AOD at 550nm, {str(da_mean.time[i].values)[:-16]}', fontsize=12)

Create animation object#

ani = animation.FuncAnimation(fig, animate, frames, interval=150)

Display animation#

HTML(ani.to_jshtml())

Plot time series for given latitude and longitude#

Convert longitude to [-180, 180] grid#

Notice that the longitude variables in the Xarray Dataset and Data Array objects are in the range of [0, 359.75]. By default, ECMWF data are on a [0, 360] grid. Should you wish to, there are two options to bring the longitude coordinates to a [-180, 180] grid. The first option, in case you already have the data downloaded, is to assign values to coordinates with the xarray function assign_coords(). The code below shifts your longitude coordinates from [0, 359.75] to [-180, 179.75].

The second option is to specify the area keyword argument right when you request data with the CDS API. The area keyword then automatically reprojects the requested data onto a [-180, 180] grid.

ds_180 = ds.assign_coords(longitude=(((ds.longitude + 180) % 360) - 180)).sortby('longitude')
da = ds_180['aod550']
da

Select location#

payerne_lat = 46.81
payerne_lon = 6.94
payerne_da = da.sel(latitude = payerne_lat, longitude = payerne_lon, method='nearest')
payerne_da

Plot forecast output at Payerne

fig = plt.figure(figsize=(10, 5))
payerne_da.plot(marker='o')
plt.suptitle("AOD at 550nm, Payerne")
plt.grid(True)

Read Aeronet data for Payerne#

# Direct sun data
file_ds=f"20250601_20250630_Payerne.lev15"
ds_data=pd.read_csv(file_ds,skiprows=6, na_values=[-999])
ds_data['date_time'] = pd.to_datetime(ds_data['Date(dd:mm:yyyy)'] + ' ' + ds_data['Time(hh:mm:ss)'], format='%d:%m:%Y %H:%M:%S')
cols_ds = ['date_time', 'AOD_1020nm', 'AOD_870nm', 'AOD_675nm', 
           'AOD_500nm', 'AOD_440nm','AOD_380nm','440-870_Angstrom_Exponent']
sub_ds = ds_data[cols_ds].copy()
sub_ds=(sub_ds
        .groupby(pd.Grouper(key='date_time', freq='h'))
        .mean()
        .reset_index())
merged_data = sub_ds
merged_data = merged_data.set_index('date_time')
merged_data

Plot forecast and Aeronet data#

fig = plt.figure(figsize=(10, 5))
payerne_da.plot(marker='o')
merged_data['AOD_500nm'].plot(marker='o', color='purple')
plt.suptitle("AOD at 550nm, Payerne")
plt.grid(True)

What is the origin of the increased AOD values over Payerne?#

da_omaod = ds['omaod550']
da_duaod = ds['duaod550']

payerne_om = da_omaod.sel(latitude = payerne_lat, longitude = payerne_lon, method='nearest')
payerne_du = da_duaod.sel(latitude = payerne_lat, longitude = payerne_lon, method='nearest')

fig = plt.figure(figsize=(10, 5))
payerne_da.plot(marker='o')
payerne_om.plot(marker='o', color='magenta')
payerne_du.plot(marker='o', color='orange')
merged_data['AOD_500nm'].plot(marker='o', color='purple')
plt.suptitle("AOD at 550nm, Payerne")
plt.grid(True)