Comparing satellite data with models#

Henk Eskes (henk.eskes@knmi.nl)
September 2024, update September 2025

Comparing models and observations

In this practical we will investigate how a model (for instance a regional air-quality forecasting model) can be compared with satellite observations (retrievals) of atmospheric composition. Atmospheric composition information is provided by several satellite instruments, and can be derived from the UV, visible, near-infrared, short-wave infrared, infrared or microwave parts of the spectrum. Radiance measurements from these instruments are converted into retrieval “Level-2” products. These are either reported as profiles (e.g. an ozone profile from the nadir-looking TROPOMI, or from the microwave limb sounder MLS), or, in many case, as a column amount (e.g. for CO, CH4, CO2, NO2, SO2, HCHO, NH3 and several other species), or as optical property (e.g. AOD - Aerosol optical depth).

Here we will focus on trace gas observations, columns and profiles.

Models minus observation departures

The difference between a measurement and the model-simulation of the measurement is a key quantity in data assimilation and validation, and is often called the departure:

Note: This has the unit of the observations, e.g. ppb, µg/m3, mol/m2.

Relative differences do not have a unit and are often easier to understand and interpret for non-specialist
.
These can be expressed as percentage difference (%).

A more symmetric form is also possible (useful when some observations become close to, or equal to 0):

In the CAMS validation reports most results are expressed in relative units,
such as the normalised bias, fractional gross error (a normalised absolute mean difference) or the correlation.

Model output is in most cases four dimensional (4D), e.g. 3D fields produced every hour, written as a 4D matrix

while the observation y is taken at a given point in space and time.
Therefore we need an “observation operator” H, to computing the model equivalent of the observation at the location and time of the observation.
The departure becomes:

In it’s simplest form H is an interpolation in time, followed by a three-dimensional interpolation in space

For atmospheric composition remote sensing (with satellites or with ground instruments) the retrieval product 
is often a profile provided on a vertical retrieval grid.
For instance, the Sentinel-5P satellite mission (TROPOMI instrument) has an ozone profile product, specified on a number of vertical levels.

The 3D interpolation operator can then be written as a successive horizontal lat-lon interpolation followed by a 
vertical interpolation from the model levels to the retrieval grid:

Comparing models and remote-sensing retrievals

It seems natural to use the above form of the observation operator (based on space-time interpolations) in comparisons against satellite profiles, 
but this is not the optimal way to compare model output with remote sensing observations.

Why not?

  1. Remote sensing observations can not retrieve concentrations at one point (vertical level), 
but rather provide height-integrated averages of concentrations weighted by 
sensitivity profiles. Comparing with model ozone point values is like comparing ozone apples with ozone pears.

  2. The number of independent pieces of information in the measurements is often much less 
that the number of retrieval layers. In this case the retrieval profile shapes are constrained 
or “smoothed” by a-priori profile information (e.g. fixed profile shape or climatology), 
and the comparisons will partly reflect the a-priori input.

Optimal Estimation Theory

The bible of remote sensing retrieval theory is the book ”Inverse methods for atmospheric sounding” by Clive Rodgers World Scientific, 2000.

The concepts which are introduced briefly below are derived and discussed in detail in this book. It is a must read for people working on (satellite) retrievals.

Comparing models and remote-sensing retrievals (continued)

Instead of comparing with retrievals (trace gas columns or profiles) we can use the model and observation operator to simulate radiances or reflectances (as function of wavelength) and compare directly with the satellite observations of radiation:

Where F is a forward radiative transfer operator which computes wavelength-dependent 
radiances given a model trace gas or aerosol vertical profile (and other quantities 
that determine the radiances).

As shortcut we wrote the profile at the satellite location as:

Notes:

  • A major advantage of the radiance comparison approach: There is no additional a-priori profile shape dependence in this comparison, only the profile of the model is needed. Retrieval methods generally depend on a pre-defined a-priori profile shape (for instance a climatology). In contrast, a forward radiative transfer model does not.

  • In numerical weather prediction (e.g. at ECMWF) the assimilation of satellite radiances is common (default) practice, for instance to assimilate satellite temperature profile information.

  • In contrast, for the global 
daily forecasts of trace gases and aerosol, CAMS is mainly assimilating retrieval products such as ozone stratospheric profiles and total columns, CO, NO2 columns, retrieved aerosol optical depth (AOD). (Some experiments are conducted in CAMS to assimilate radiances, for instance to constrain aerosol properties.)

  • For validation and interpretation the radiance assimilation is not always practical: differences in radiances are not so easily related 
to errors in trace gas concentrations. Also: a complex forward model to compute the radiances has to be 
implemented in the comparison.

We have now discussed two options:

  1. Compare retrieved profiles with a model using a 4D interpolation observation operator.

  2. Comparing measured satellite radiances with radiances simulated with the model and a radiative transfer model.

But there is a third option:

Instead of comparing with retrievals products we can also reproduce the retrieval. 
This means:

  • first simulate radiances (as function of wavelength) and then

  • apply the retrieval operator with the a-priori profile and other parameters used for the retrieval.

This may sounds very complicated, but it is not, and can be implemented using information provided in the L2 satellite data products, as explained below.

The model-observation difference now becomes:

Where F is a forward radiative transfer operator computing radiances based on the model profile and input parameters.
R is the retrieval method which depends on:

  • The radiances or reflectances r

  • The a-priori profile xa. In this tutorial these are trace gas profiles, for instance ozone or NO2.

  • Other retrieval input parameters b.
    
Examples of “b” are surface albedo, cloud properties, water vapour, temperature profile

Forward and retrieval operators are typically non-linear, but can be linearised around the a-priori,
 see e.g. the book of Rodgers (2000):

Where F is a matrix converting from real space to wavelength space,
 and R is the linearised retrieval matrix connecting wavelength space to real space.

The model-simulated retrieval

The product of the linearised retrieval method times forward model

is the Averaging Kernel of the retrieval (see Rodgers, 2000).

Excercise: derive this relation using the formulas above.

The departure now becomes:

where y is the retrieved quantity (for instance an ozone profile, a vector quantity).

The averaging kernel and a-priori profile are normally included in the retrieval dataproduct.

Therefore this method 3 comparison can be 
done without additional radiative transfer calculations.

Note:

In the discussion so far the averaging kernel is normally a square matrix, N x N, where N is the number of vertical layers used for the computations.
However, this is not needed: the forward model may use a different set of vertical layers 
than the retrieval method.

For an accurate forward modelling of the radiation a sufficient number of layers is needed. Typically 20-40 layers are used.

But the number of independent pieces of information derived from the satellite data is often (much) smaller than the number of layers. For instance, the ozone profile retrieval for TROPOMI has about 6 pieces of information on ozone. Therefore the results could be provided for a limited number of layers or subcolumns.

The extreme example is the retrieval of total columns (like for TROPOMI NO2), where the retrieval produces 
just one number instead of a profile.
The averaging kernel then becomes a vector
.

Typically remote sensing retrievals are very sensitive to the a-priori profile shape 
provided to the retrieval. However, the averaging kernel depends in the same way 
on the a-priori, because it contains the retrieval method. While absolute comparisons 
still depend on the a-priori, the relative comparison

becomes a-priori independent when the averaging kernel is used in the comparison. Note that this holds for weak absorbers with a linear relation between the absorber and the radiance (such as for NO2 to a good approximation). For strong absorbers the forward model becomes non-linear and has to be linearised around an a-priori starting point. This introduces a (often weak) dependence of the comparison on the linearisation point.

There is also a high-level equivalence between comparisons in radiance space and 
in profile space. When an optimised profile reproduces the radiances, eq. (2),
 then the same profile will give a perfect match with the retrieval, 
independent of the a-priori chosen in the retrieval method.

But:

  • The forward and retrieval operators should be reasonably linear around the a-priori.

  • The retrieval method should not be too simple and should not loose information.

Reference: Stefano Migliorini, “On the Equivalence between Radiance and Retrieval Assimilation”, MWR 2012.

Satellite retrievals of columns

Several column products, like TROPOMI SO2, HCHO and NO2 total columns, are derived using the Differential Optical Absorption Spectroscopy method. This two-step approach first compares differential spectral features in the measurement with reference absorption spectra, to derive the amount of absorption along the light path. Secondly, a so-called air-mass factor is derived using radiative transfer calculations to compute the vertical column.

The DOAS column retrieval approach has been re-formulated using Rodgers Optimal Estimation formalism (Eskes and Boersma, 2003). This leads naturally to the definition of averaging kernel vectors for total column DOAS retrieved quantities.

NO2 retrieval example: The DOAS averaging kernel profiles are a measure of the sensitivity of the measurement to NO2 at a given altitude. NO2 is retrieved in the 400-500 nm spectral range, where Raileigh scattering Each retrieval has its own kernel, depending on retrieval parameters like the surface reflectivity, cloud cover, geometry (solar zenith and satellite viewing angles). The image below shows three examples:

  • When the surface is dark the sensitivity close to the surface (and kernel value) is small.

  • When the surface is bright (e.g. sand, snow) the sensitivity close to the surface is large.

  • Clouds obscure the NO2 close to the surface. The satellite has almost no sensitivity below the cloud, but is more sensitive just above the cloud.

For the weak absorber total column retrieval (e.g. TROPOMI NO2) the formula to compare model with the retrieval simplifies further. For this retrieval

and therefore the (model-observation) departure becomes

or

The retrieved column (a single number) is compared with the dot product of the kernel vector and the vector of modelled partial columns for all the vertical layers.

Note that the a-priori profile is not needed to compare the retrieval with the model.
This is the reason why the a-priori is not provided in the NO2 data product.

Error sources in the model-satellite comparison

The spread in (model-obs) values is determined by three error terms:

  • Model (forecast) errors

  • Combined observation + retrieval errors

  • Representativity errors

In this formula we assume that the three error terms are uncorrelated.

The satellite retrieval provides an error covariance matrix quantifying the retrieval errors. This is a matrix: because of the limited vertical resolution the errors in the retrieved values will be correlated between layers at different altitudes. If the retrieved ozone has a positive error at a certain altitude, say 20 km, then it is likely that the error at 21 km is also positive. This is described by the off-diagonal terms in the retrieval covariance matrix. Layers further apart may be anti-correlated: a positive error at 20 km may on average be related to a negative error at 40 km altitude.

Representativity errors are often neglected but are also often the largest term, especially for strongly varying short-lived species!
Examples:

  • Comparing a roadside NO2 / PM monitor with the 10x10 km gridcell of a European air quality model

  • Difference in satellite footprint and model gridcell (different air masses)

  • Measurements in mountainous terrain

If kernels are not used in the comparison, then a so-called “smoothing error” (Rodgers, 2000) has to be added to the error budget to reflect the finite (vertical) resolution of the retrieval.

Take-home messages

  • Retrievals (profiles, columns) are often strongly dependent on the a-priori used in the retrieval method.

  • Remote sensing retrievals can be compared with models in two ways: (1) by using only the 4D interpolation form of the observartion operator, or (2) by applying the averaging kernels after the 4D interpolation.

  • Method (2) is highly recommended. By using the averaging kernels the relative comparison between model and retrieval becomes independent (or only weakly dependent in the non-linear case) on the a-priori assumptions made in the retrieval. In general (model-measurement) differences will be smaller with method 1 because the smoothing error is avoided.