In this notebook you will see how to:
inspect GRIB data
work with fields and fieldlists
modify fields
convert GRIB to Xarray
convert GRIB to Pandas
Getting the data¶
First we read some GRIB data from disk with from_source(). The returned object provides some basic information but its primary goal is to convert the data into the required representation for further work. The actual data loading is deferred, as much as possible, until the data is converted into a given type.
import earthkit.data as ekd
d = ekd.from_source("file", "test4.grib")
dd.available_typesFieldlists and fields¶
GRIB data can be converted into a FieldList, where each field represents a GRIB message. In earthkit a field is a horizontal slice of the atmosphere at a given time. In this sense the Field object is generic enough to be able to represent other types of data than GRIB (e.g. NetCDF, GeoTIFF, dictionary data etc.)
# fl is a fieldlist
fl = d.to_fieldlist()len(fl)# ls() lists the fields in the fieldlist
fl.ls()# we can iterate through the fields
for f in fl:
print(f)Fields¶
# f is the first field in the fieldlist
f = fl[0]A Field is composed of format independent components, each containing a set of metadata keys. The most important components and keys are as follows:
parameter:
parameter.variable
parameter.units
parameter.chem_variable
time:
time.base_datetime
time.valid_datetime
time.step
vertical:
vertical.level
vertical.level_type
geography:
geography.latitudes
geography.longitudes
geography.shape
ensemble:
ensemble.member
There are two ways to access the related values from the components:
# use the get() method
f.get("time.base_datetime")# use the key method on the component
f.time.base_datetime()The components contain format independent metadata. This is most obvious when looking at the level type or the units.
# the GRIB metadata would be isobaricInhPa
f.vertical.level_type()# the GRIB metadata would be K
f.parameter.units()To get an overview about the components/keys of a field simply use the automatic display.
# this is the same as f.describe()
fField values¶
f.to_numpy()Field geography¶
# the shape is based on the geography (when available)
f.shapeWe can use latlons() to get a tuple of the latitudes and longitudes arrays. Please note “geography.latlons” cannot be used as a key.
f.geography.latlons()Modifying fields¶
# set() generates a new field with updated values/components
vals = f.values + 2.0
f1 = f.set({"vertical.level": 700, "values": vals})
f1.ls()# compare the metadata in the old and new fields
f.get("vertical.level"), f1.get("vertical.level")# compare the values in the old and new fields
f.values.max(), f1.values.max()Fieldlist subsetting¶
fl.sel({"parameter.variable": "t"}).ls()Arithmetics¶
# select geopotential fields
z = fl.sel({"parameter.variable": "z"})We compute the geopotential height using fieldlist arithmetic. The result will be a new fieldlist entirely stored in memory.
# compute the geopotential height
# h and z are fieldlists
h = z / 9.81
# check the results
for zv, hv in zip(z, h):
print(f"z-max: {zv.values.max()} h-max: {hv.values.max()}")The metadata in the new “h” fields are still the same as in the “z” fields. We can correct it manually.
h.ls()h = h.set({"parameter.variable": "gh", "parameter.units": "gpm"})
h.ls()Fields can also be used in arithmetic expressions.
# compute the geopotential height
# h and z_first are fields
z_first = z[0]
h = z[0] / 9.81
f"z-max: {z_first.values.max()} h-max: {h.values.max()}"Converting to Xarray¶
For this exercise we fetch another GRIB file containing more variation in time and level.
# get GRIB data containing variations in date/time/step and level
d = ekd.from_source("file", "pl.grib")
fl = d.to_fieldlist()
fl.unique("parameter.variable", "time.base_datetime", "time.step", "vertical.level")We can convert fieldlists to Xarray using earthkit-data’s own Xarray engine. It comes with a very large number of options, but for our data the defaults are perfectly fine.
ds = fl.to_xarray()
dsThe Xarray we created uses the “forecasts” time dimension mode, with forecast_reference_time and step chosen as the temporal dimensions. The input data also forms a hypercube if we choose the valid datetime as the temporal dimension, so we can use the time_mode_dim option as follows:
ds = fl.to_xarray(time_dim_mode="valid_time")
dsXarrays created with earthkit-data have the earthkit accessor. It is an experimental feature and for each DataArray it stores earthkit specific metadata.
ds["t"].earthkit.grid_specConverting to Pandas¶
fl[0].to_pandas()