Learning objectives
By the end of this notebook you will be able to:
Describe ERA5’s native N320 reduced Gaussian grid and the values dimension
Extract a spatial sub-area from a FieldList using lat/lon masks
Regrid a FieldList to a regular lat-lon grid with earthkit.geo
Write native and regridded fields to Zarr
This notebook assumes familiarity with FieldLists. If you haven’t done notebook 1 yet, start there.
ERA5’s native grid¶
ERA5 is produced on the N320 reduced Gaussian grid:
640 latitude rows at Gauss-Legendre positions (not evenly spaced)
Reduced — fewer longitude points per row near the poles, keeping physical spacing roughly uniform
~542,000 grid points globally, ~31 km average spacing
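The row positions listed above can be reproduced with NumPy alone. The following is a minimal sketch, assuming a spherical Earth with roughly 111.2 km per degree of latitude (an approximation introduced here, not a figure from ERA5 documentation). It computes the 640 Gauss-Legendre latitudes and shows that the gaps between rows vary slightly and average about 31 km.

```python
import numpy as np

N = 320                 # Gaussian number: latitude rows per hemisphere
n_rows = 2 * N          # 640 rows from pole to pole

# Gauss-Legendre nodes are the roots of the Legendre polynomial of degree 2N;
# each node is the sine of a Gaussian latitude.
nodes, _ = np.polynomial.legendre.leggauss(n_rows)
gaussian_lats = np.degrees(np.arcsin(nodes))[::-1]   # ordered north to south

spacing_deg = -np.diff(gaussian_lats)   # positive gaps between adjacent rows
km_per_degree = 111.2                   # assumed spherical-Earth approximation
mean_spacing_km = spacing_deg.mean() * km_per_degree

print(f"rows             : {gaussian_lats.size}")
print(f"northernmost row : {gaussian_lats[0]:.3f} deg")
print(f"row gap range    : {spacing_deg.min():.5f} to {spacing_deg.max():.5f} deg")
print(f"mean row gap     : {mean_spacing_km:.1f} km")
```

The northernmost row sits just short of the pole, and the gaps are close to, but not exactly, 180/640 degrees.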
When ERA5 GRIB is loaded natively — without requesting server-side regridding — xarray represents the field with a values dimension: a flat 1-D array over all grid points. Latitude and longitude are coordinate arrays attached per point, not independent axes.
Dimensions: values=542080
Coordinates: latitude (values), longitude (values)
Data vars: 2t (values)
This is the canonical form for N320 data. The familiar 2-D (latitude, longitude) array only appears after regridding to a regular grid, either server-side (CDS grid= parameter) or locally with earthkit-geo.
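To make the flat layout concrete, here is a small sketch on a hypothetical toy reduced grid (4 rows with 4, 8, 8 and 4 points; these numbers are invented for illustration and are not N320 values). It builds the per-point latitude and longitude arrays that share a single values axis with the field.

```python
import numpy as np

# Hypothetical toy reduced grid: 4 latitude rows, fewer points near the poles.
row_lats = np.array([60.0, 20.0, -20.0, -60.0])
points_per_row = np.array([4, 8, 8, 4])

# One latitude per grid point: repeat each row latitude by its point count.
latitude = np.repeat(row_lats, points_per_row)
# Longitudes are evenly spaced within each row, starting at 0 degrees.
longitude = np.concatenate(
    [np.arange(n) * (360.0 / n) for n in points_per_row]
)
# The field is a flat 1-D array along the same "values" axis.
field = 288.0 + 0.1 * latitude   # made-up numbers standing in for 2t in kelvin

print(latitude.shape, longitude.shape, field.shape)

# Rows have different lengths, so there is no 2-D slicing;
# a row is selected through its coordinate instead.
row = field[latitude == 20.0]
print(row.size)
```

Because the rows differ in length, the points cannot be reshaped into a rectangular array without interpolation, which is why the coordinates travel with each point.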
When to regrid: Only when downstream processing genuinely requires a 2-D rectangular array. Regrid as late as possible — you can always regrid, but you cannot recover lost sub-grid detail.
Setup¶
import earthkit.data as ekd
from earthkit.geo import regrid
import earthkit.plots as ekp
import os
ekd.settings.set({"cache-policy": "user"})
os.makedirs("data", exist_ok=True)
print("Setup complete")
Load ERA5 as a FieldList¶
To get ERA5 on its native N320 grid, omit the grid parameter in the CDS request. Adding grid= triggers server-side regridding before the data reaches you.
# DATA: Native N320 reduced Gaussian ERA5 (request without the grid= parameter):
#
# fl = ekd.from_source(
#     "cds",
#     "reanalysis-era5-single-levels",
#     request=dict(
#         variable=["2m_temperature", "msl"],
#         product_type="reanalysis",
#         date="2020-01-01",
#         time="12:00",
#         # no grid= key -> native N320
#         format="grib",
#     ),
# ).to_fieldlist()
#
# For a pre-downloaded file:
# fl = ekd.from_source("file", "/path/to/era5_native.grib")
# We use a native N320 sample file here, so the steps below run on the native grid.
# DATA: era5-N320-2t-msl-20200101.grib (per the file name: 2t and msl on N320, 1 January 2020)
fl = ekd.from_source("sample", "era5-N320-2t-msl-20200101.grib").to_fieldlist()
print(f"Fields: {len(fl)}")
f0 = fl[0]
f0.describe('geography')
Sub-area extraction¶
The cleanest approach is to specify area in the CDS request — the server returns only the points you need, minimising download size.
For post-load extraction, build a mask from the field’s lat/lon coordinates.
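The mask logic can be tried on synthetic points before touching real data. This is a sketch only: box_mask is a hypothetical helper written for this illustration, not an earthkit function. It assumes longitudes may arrive in the 0 to 360 convention and that the box does not cross the antimeridian.

```python
import numpy as np

def box_mask(lats, lons, north, west, south, east):
    """Boolean mask for points inside a lat/lon box.

    Box longitudes are given in [-180, 180); point longitudes may use either
    the 0..360 or the -180..180 convention. The box must not cross 180 degrees.
    """
    lats = np.asarray(lats)
    lons_wrapped = (np.asarray(lons) + 180.0) % 360.0 - 180.0
    return (
        (lats >= south) & (lats <= north)
        & (lons_wrapped >= west) & (lons_wrapped <= east)
    )

# Three synthetic points with 0..360 longitudes:
# 338E is 22W (inside the box), 13.4E is inside, 286E is 74W (outside).
lats = np.array([64.1, 52.5, 40.7])
lons = np.array([338.0, 13.4, 286.0])
values = np.array([271.0, 275.0, 268.0])   # made-up field values

mask = box_mask(lats, lons, north=72, west=-25, south=30, east=45)
print(mask)           # [ True  True False]
print(values[mask])   # [271. 275.]
```

Without the wrap, the first point would be dropped, because 338 is not between -25 and 45 even though it lies at 22°W.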
# Request-time (preferred):
#
# fl_europe = ekd.from_source(
#     "cds",
#     "reanalysis-era5-single-levels",
#     request=dict(
#         variable=["2m_temperature", "msl"],
#         product_type="reanalysis",
#         date="2020-01-01",
#         time="12:00",
#         area=[72, -25, 30, 45],  # N, W, S, E
#         format="grib",
#     ),
# ).to_fieldlist()
# fl_europe[0].describe('geography')
# Post-load: build a mask from lat/lon coordinates
# latlons(flatten=True) returns (lats, lons) as flat 1-D arrays; works for any grid type
lats, lons = f0.geography.latlons(flatten=True)
# GRIB longitudes are often stored as 0 to 360; wrap to [-180, 180) so the
# negative western bound below matches (a no-op if they are already wrapped)
lons = (lons + 180) % 360 - 180
# Europe: 30–72°N, 25°W–45°E
mask = (lats >= 30) & (lats <= 72) & (lons >= -25) & (lons <= 45)
print(f"Global : {len(lats):,} grid points")
print(f"Europe : {int(mask.sum()):,} grid points ({100*mask.mean():.1f}% of global)")
Regridding to regular lat-lon¶
Use earthkit.geo.regrid() to interpolate a FieldList to a new grid. earthkit-geo uses mir as its backend: interpolation matrices are calculated on the fly and cached for later use.
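Why caching a matrix pays off can be seen in a toy example. The sketch below is not mir's algorithm: it builds a dense nearest-neighbour weight matrix for a 1-D ring of longitudes (real interpolation matrices are sparse and typically use other methods), and nearest_neighbour_matrix is a hypothetical helper. The point is only that the matrix depends on the two grids, not on the data, so it can be built once and applied to every field.

```python
import numpy as np

def nearest_neighbour_matrix(src_lon, tgt_lon):
    """Weights W such that W @ src_values picks the nearest source point
    for each target longitude on a periodic 0..360 ring (toy example)."""
    diff = np.abs(tgt_lon[:, None] - src_lon[None, :])
    diff = np.minimum(diff, 360.0 - diff)      # shortest way around the ring
    nearest = diff.argmin(axis=1)
    weights = np.zeros((tgt_lon.size, src_lon.size))
    weights[np.arange(tgt_lon.size), nearest] = 1.0
    return weights

src_lon = np.arange(0.0, 360.0, 45.0)   # 8 source points
tgt_lon = np.arange(0.0, 360.0, 90.0)   # 4 target points

# Built once from the grid geometry alone ...
W = nearest_neighbour_matrix(src_lon, tgt_lon)

# ... then reused for any number of fields on the same source grid.
field_a = np.arange(8, dtype=float)
field_b = field_a * 10.0
regridded_a = W @ field_a
regridded_b = W @ field_b
print(regridded_a)   # [0. 2. 4. 6.]
print(regridded_b)   # [ 0. 20. 40. 60.]
```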
# Regrid operates directly on the FieldList and returns a new FieldList
fl_1deg = regrid(fl, grid={"grid": [1, 1]})
f_1deg = fl_1deg[0]
print(f"Original : gridType={f0.metadata('gridType')} values={f0.values.shape}")
print(f"Regridded : gridType={f_1deg.metadata('gridType')} "
      f"Ni={f_1deg.metadata('Ni')} Nj={f_1deg.metadata('Nj')}")
print(f"Original shape: {fl[0].shape}")
print(f"Regridded shape: {fl_1deg[0].shape}")
fig = ekp.Figure(1, 2, figsize=(8, 6), domain='Europe')
for i, d in enumerate([fl, fl_1deg]):
    fig.add_map(0, i).grid_cells(d, style='auto')
fig.coastlines()
fig.title("Original (N320) vs Regridded (1°) ERA5")
fig.legend()
fig.show()
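# Extract Europe and write regridded store to Zarr
ds_1deg = f_1deg.to_xarray()
# Wrap longitudes to [-180, 180) and sort, so the western bound (-25) can be
# selected even if the regridded field stores longitudes as 0 to 360
ds_1deg = ds_1deg.assign_coords(
    longitude=(ds_1deg.longitude + 180) % 360 - 180
).sortby("longitude")
xr_europe_1deg = ds_1deg.sel(
    latitude=slice(72, 30),  # assumes latitude runs north to south
    longitude=slice(-25, 45),
)
xr_europe_1deg.to_zarr("data/era5_europe_regridded.zarr", mode="w")
print(f"Regridded Europe: {dict(xr_europe_1deg.sizes)}")
print("Written to data/era5_europe_regridded.zarr")
Grid format summary¶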
| Format | Dimension | Global points | Notes |
|---|---|---|---|
| N320 reduced Gaussian | values (1-D) | ~542,000 | Native ERA5; no interpolation error |
| Regular lat-lon 1° | (lat, lon) (2-D) | 181 × 360 | Standard regridded output |
| Regular lat-lon 0.25° | (lat, lon) (2-D) | 721 × 1,440 | CDS default with grid=[0.25,0.25] |
Rule of thumb: stay native until something downstream forces a regrid.
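The point counts in the table follow from simple arithmetic. As a small sketch (regular_latlon_shape is a hypothetical helper, and it assumes a global grid that includes both poles and does not repeat the 0°/360° meridian), the totals can be compared with the 542,080 native points:

```python
def regular_latlon_shape(resolution_deg):
    """Rows and columns of a global regular lat-lon grid at a given resolution."""
    n_lat = round(180 / resolution_deg) + 1   # both poles are included
    n_lon = round(360 / resolution_deg)       # 0 and 360 are the same meridian
    return n_lat, n_lon

N320_POINTS = 542_080   # native grid size quoted earlier in this notebook

totals = {}
for res in (1.0, 0.25):
    n_lat, n_lon = regular_latlon_shape(res)
    totals[res] = n_lat * n_lon
    ratio = totals[res] / N320_POINTS
    print(f"{res:>5}° : {n_lat} x {n_lon} = {totals[res]:,} points "
          f"({ratio:.2f}x native)")
```

The 0.25° grid holds almost twice as many values as the native grid without adding information, which is one more reason to stay native until a regular grid is actually needed.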
Summary¶
You have:
Understood the N320 reduced Gaussian grid and the values dimension
Extracted a Europe sub-area using a lat/lon mask
Regridded a FieldList to 1° regular lat-lon with regrid(fl, grid={"grid": [1, 1]})
Written native and regridded Zarr stores
Activity
Regrid to 2°. How does the grid point count compare to 1°?
Apply the Europe lat/lon mask directly to the N320 field. How many points fall in Europe? Compare to the 1° grid Europe store.
Modify the CDS request comment to request wind fields (u, v) in addition to temperature. How would you then select each variable from the returned FieldList?
fl_2deg = regrid(fl, grid={"grid": [2, 2]}, method="linear")
xr_2deg = fl_2deg.to_xarray()
print(xr_2deg["2t"].shape)