4. Grid Formats and Regridding

Learning objectives

By the end of this notebook you will be able to:

  • Describe ERA5’s native N320 reduced Gaussian grid and the values dimension

  • Extract a spatial sub-area from a FieldList using lat/lon masks

  • Regrid a FieldList to a regular lat-lon grid with earthkit.geo

  • Write regridded fields to Zarr

This notebook assumes familiarity with FieldLists. If you haven’t done notebook 1 yet, start there.


ERA5’s native grid

ERA5 is produced on the N320 reduced Gaussian grid:

  • 640 latitude rows at Gauss-Legendre positions (not evenly spaced)

  • Reduced — fewer longitude points per row near the poles, keeping physical spacing roughly uniform

  • ~542,000 grid points globally, ~31 km average spacing
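
The uneven latitude spacing can be checked numerically. The sketch below assumes NumPy is available and uses the standard definition of Gaussian latitudes as the arcsine of the Gauss-Legendre quadrature nodes. The variable names are illustrative, and this is not necessarily how earthkit derives the grid internally.

```python
import numpy as np

N = 320                      # Gaussian number: rows between a pole and the equator
n_rows = 2 * N               # 640 latitude rows pole to pole

# Gauss-Legendre nodes are the roots of the Legendre polynomial of degree 2N,
# interpreted as sin(latitude). leggauss returns them in ascending order.
nodes, _weights = np.polynomial.legendre.leggauss(n_rows)
gaussian_lats = np.degrees(np.arcsin(nodes))[::-1]   # north to south

spacing = -np.diff(gaussian_lats)                    # positive row-to-row spacing

print(f"Rows               : {gaussian_lats.size}")
print(f"Northernmost row   : {gaussian_lats[0]:.4f} deg")
print(f"Spacing near pole  : {spacing[0]:.5f} deg")
print(f"Spacing at equator : {spacing[N - 1]:.5f} deg")
```

The rows are close to, but not exactly, equally spaced, and no row sits on a pole or on the equator.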

When ERA5 GRIB is loaded natively — without requesting server-side regridding — xarray represents the field with a values dimension: a flat 1-D array over all grid points. Latitude and longitude are coordinate arrays attached per point, not independent axes.

Dimensions:  values=542080
Coordinates: latitude (values), longitude (values)
Data vars:   2t (values)

This is the canonical form for N320 data. The familiar 2-D (latitude, longitude) array only appears after regridding to a regular grid — either server-side (CDS grid= parameter) or locally with earthkit-geo.
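
To see why a reduced grid ends up as a flat values dimension, here is a minimal sketch on a hypothetical four-row toy grid (not real ERA5 geometry), assuming only NumPy. Because the rows hold different numbers of points, the data cannot be stored as one rectangular array, so each point carries its own latitude and longitude.

```python
import numpy as np

# Hypothetical toy grid: 4 latitude rows with fewer points near the poles.
row_lats = np.array([60.0, 20.0, -20.0, -60.0])
points_per_row = np.array([4, 8, 8, 4])

# Rows have different lengths, so there is no rectangular (latitude, longitude)
# array. Instead, every point gets its own latitude/longitude pair.
latitude = np.repeat(row_lats, points_per_row)
longitude = np.concatenate(
    [np.arange(n) * (360.0 / n) for n in points_per_row]
)
values = np.arange(latitude.size, dtype=float)   # placeholder field values

print(f"values dimension: {values.size} points")
for lat, lon, val in zip(latitude[:6], longitude[:6], values[:6]):
    print(f"  lat={lat:6.1f}  lon={lon:6.1f}  value={val:.0f}")
```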

When to regrid: Only when downstream processing genuinely requires a 2-D rectangular array. Regrid as late as possible: you can always regrid later, but you cannot recover detail that interpolation has already smoothed away.
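
The loss of detail can be illustrated with a small synthetic example. This is a sketch under simple assumptions: a 1-D signal, and block averaging as a crude stand-in for regridding to a coarser grid. It is not how earthkit-geo interpolates.

```python
import numpy as np

# Synthetic "native" signal: a smooth wave plus fine-scale detail.
x = np.linspace(0.0, 2.0 * np.pi, 64, endpoint=False)
native = np.sin(x) + 0.3 * np.sin(16 * x)

# Coarsen by averaging blocks of 8 points (stand-in for regridding).
coarse = native.reshape(8, 8).mean(axis=1)

# Going back to the fine grid cannot restore the detail that was averaged out.
restored = np.repeat(coarse, 8)
rms_error = np.sqrt(np.mean((restored - native) ** 2))

print(f"RMS difference after round trip: {rms_error:.3f}")
```

The round trip returns an array of the original size, but the fine-scale component is gone.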

Setup

import earthkit.data as ekd
from earthkit.geo import regrid
import earthkit.plots as ekp
import os

ekd.settings.set({"cache-policy": "user"})
os.makedirs("data", exist_ok=True)
print("Setup complete")

Load ERA5 as a FieldList

To get ERA5 on its native N320 grid, omit the grid parameter in the CDS request. Adding grid= triggers server-side regridding before the data reaches you.

# DATA: Native N320 reduced Gaussian ERA5, requested without the grid= parameter:
#
# fl = ekd.from_source(
#     "cds",
#     "reanalysis-era5-single-levels",
#     request=dict(
#         variable=["2m_temperature", "msl"],
#         product_type="reanalysis",
#         date="2020-01-01",
#         time="12:00",
#         # no grid= -> native N320
#         format="grib",
#     ),
# ).to_fieldlist()
#
# For a pre-downloaded file:
# fl = ekd.from_source("file", "/path/to/era5_native.grib")

# We use a pre-packaged sample file on the native N320 grid here.
# The FieldList API and regrid calls are the same as for data retrieved from the CDS.
# DATA: era5-N320-2t-msl-20200101.grib (2t and msl on N320, 2020-01-01, per the file name)
fl = ekd.from_source("sample", "era5-N320-2t-msl-20200101.grib").to_fieldlist()

print(f"Fields: {len(fl)}")
f0 = fl[0]
f0.describe('geography')

Sub-area extraction

The cleanest approach is to specify area in the CDS request — the server returns only the points you need, minimising download size.

For post-load extraction, build a mask from the field’s lat/lon coordinates.

# Request-time (preferred):
#
# fl_europe = ekd.from_source(
#     "cds",
#     "reanalysis-era5-single-levels",
#     request=dict(
#         variable=["2m_temperature", 'msl'],
#         product_type="reanalysis",
#         date="2020-01-01",
#         time="12:00",
#         area=[72, -25, 30, 45],   # N, W, S, E
#         format="grib",
#     ),
# ).to_fieldlist()
# fl_europe[0].describe('geography')

# Post-load: build a mask from lat/lon coordinates
# latlons(flatten=True) returns (lats, lons) as flat 1-D arrays — works for any grid type
lats, lons = f0.geography.latlons(flatten=True)

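# Europe: 30–72°N, 25°W–45°E
# GRIB longitudes are often stored as 0..360, so wrap them to -180..180 before comparing
lons = (lons + 180) % 360 - 180
mask = (lats >= 30) & (lats <= 72) & (lons >= -25) & (lons <= 45)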

print(f"Global  : {len(lats):,} grid points")
print(f"Europe  : {int(mask.sum()):,} grid points  ({100*mask.mean():.1f}% of global)")
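
The masking step can also be written as a small reusable helper. The sketch below assumes only NumPy. The function name bbox_mask and the toy points are hypothetical, and the helper does not handle boxes that cross the date line. It wraps longitudes to -180..180 so that input stored as 0..360 is compared correctly.

```python
import numpy as np

def bbox_mask(lats, lons, north, west, south, east):
    """Boolean mask for points inside a box; bounds use -180..180 longitudes."""
    lats = np.asarray(lats, dtype=float)
    # Map longitudes to [-180, 180) so 0..360 input compares correctly
    lons = (np.asarray(lons, dtype=float) + 180.0) % 360.0 - 180.0
    return (lats >= south) & (lats <= north) & (lons >= west) & (lons <= east)

# Toy points: Paris, Reykjavik (longitude given in 0..360), New York, Cairo
toy_lats = np.array([48.9, 64.1, 40.7, 30.0])
toy_lons = np.array([2.4, 338.1, 286.0, 31.2])
toy_values = np.array([280.0, 270.0, 275.0, 290.0])

mask_toy = bbox_mask(toy_lats, toy_lons, north=72, west=-25, south=30, east=45)
print(mask_toy)               # which points fall inside the Europe box
print(toy_values[mask_toy])   # the same mask indexes the flat values array
```

Because the mask has the same length as the values dimension, it can index the field values directly, whatever the grid type.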

Regridding to regular lat-lon

Use earthkit.geo.regrid() to interpolate a FieldList to a new grid. earthkit-geo uses MIR as its backend: the interpolation matrices are computed on the fly and cached for reuse.

# Regrid operates directly on the FieldList and returns a new FieldList
fl_1deg = regrid(fl, grid={"grid": [1, 1]})

f_1deg = fl_1deg[0]
print(f"Original  : gridType={f0.metadata('gridType')}  values={f0.values.shape}")
print(f"Regridded : gridType={f_1deg.metadata('gridType')}  "
      f"Ni={f_1deg.metadata('Ni')}  Nj={f_1deg.metadata('Nj')}")
print(f"Original  shape: {fl[0].shape}")
print(f"Regridded shape: {fl_1deg[0].shape}")

# Plot the native and regridded fields side by side over Europe
fig = ekp.Figure(1, 2, figsize=(8, 6), domain="Europe")

for i, d in enumerate([fl, fl_1deg]):
    fig.add_map(0, i).grid_cells(d, style="auto")

fig.coastlines()
fig.title("Original (N320) vs Regridded (1°) ERA5")
fig.legend()
fig.show()

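# Extract Europe and write the regridded store to Zarr
xr_1deg = f_1deg.to_xarray()
# Wrap longitudes to -180..180 and sort, so the slice can span the Greenwich meridian
xr_1deg = xr_1deg.assign_coords(
    longitude=(xr_1deg.longitude + 180) % 360 - 180
).sortby("longitude")
xr_europe_1deg = xr_1deg.sel(
    latitude=slice(72, 30),   # assumes latitude runs north to south
    longitude=slice(-25, 45),
)
xr_europe_1deg.to_zarr("data/era5_europe_regridded.zarr", mode="w")
print(f"Regridded Europe: {dict(xr_europe_1deg.sizes)}")
print("Written to data/era5_europe_regridded.zarr")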

Grid format summary

Format                  Dimension          Global points   Notes
N320 reduced Gaussian   values (1-D)       ~542,000        Native ERA5; no interpolation error
Regular lat-lon 1°      (lat, lon) (2-D)   181 × 360       Standard regridded output
Regular lat-lon 0.25°   (lat, lon) (2-D)   721 × 1,440     CDS default with grid=[0.25,0.25]

Rule of thumb: stay native until something downstream forces a regrid.
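
The point counts in the table follow from simple arithmetic. The sketch below assumes a global regular grid that includes both poles and does not repeat the 0°/360° meridian. regular_grid_shape is a hypothetical helper, not part of earthkit.

```python
def regular_grid_shape(resolution_deg):
    """(n_lat, n_lon) for a global regular grid that includes both poles."""
    n_lat = round(180 / resolution_deg) + 1   # 90N to 90S inclusive
    n_lon = round(360 / resolution_deg)       # 0E up to, not including, 360E
    return n_lat, n_lon

N320_POINTS = 542_080   # size of the values dimension shown earlier

for res in (1.0, 0.25):
    n_lat, n_lon = regular_grid_shape(res)
    total = n_lat * n_lon
    print(f"{res:>5} deg: {n_lat} x {n_lon} = {total:,} points "
          f"({total / N320_POINTS:.2f}x N320)")
```

A 0.25° regular grid holds roughly twice as many points as N320, so regridding to it adds storage without adding information.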


Summary

You have:

  • Understood the N320 reduced Gaussian grid and values dimension

  • Extracted a Europe sub-area using a lat/lon mask

  • Regridded a FieldList to 1° regular lat-lon with regrid(fl, grid={"grid": [1, 1]})

  • Written a regridded Europe store to Zarr


Activity

  1. Regrid to 2°. How does the grid point count compare to 1°?

  2. Apply the Europe lat/lon mask directly to the N320 field. How many points fall in Europe? Compare to the 1° grid Europe store.

  3. Modify the CDS request comment to request wind fields (u, v) in addition to temperature. How would you then select each variable from the returned FieldList?

# Starting point for activity 1: regrid to 2° and inspect the resulting shape
fl_2deg = regrid(fl, grid={"grid": [2, 2]}, method="linear")
xr_2deg = fl_2deg.to_xarray()
print(xr_2deg["2t"].shape)