This short how-to guides you through the steps to create a timeseries of the agricultural land fraction using satellite-based land cover data and save it as a csv file. The final csv file can be used as, e.g., exposure data for the drought risk estimation.
We use the high-resolution land cover dataset from Copernicus Climate Data Store: “Land cover classification gridded maps from 1992 to present derived from satellite observations” at 300m spatial resolution.
⏱️ Time needed! Requesting and downloading this high-resolution data for just one year takes about ~9 minutes. For all years currently available (1992-2023), the download takes about 4 hours.
Settings¶
User settings¶
admin_id = "EL64" # Example admin ID for Central Greece
# Year range for data download
start_year = 1992
end_year = 2022 # Adjust based on latest available dataSetup of environment¶
import cdsapi
import zipfile
import xarray as xr
import numpy as np
import pandas as pd
import geopandas as gpd
import regionmask
from pathlib import Path
import os
# Set up data directories
data_dir = Path("../data")
satellite_dir = data_dir / "satellite_landcover"
satellite_dir.mkdir(exist_ok=True)
print(f"\nSatellite data directory: {satellite_dir}")
Satellite data directory: ../data/satellite_landcover
Setup region specifics¶
# Read NUTS shapefiles
regions_dir = data_dir / 'regions'
nuts_shp = regions_dir / 'NUTS_RG_20M_2024_4326' / 'NUTS_RG_20M_2024_4326.shp'
nuts_gdf = gpd.read_file(nuts_shp)
# Select the region of interest
sel_gdf = nuts_gdf[nuts_gdf['NUTS_ID'] == admin_id]
print(f"Found {admin_id} region: {sel_gdf['NUTS_NAME'].values[0]}")
print(f"Bounding box: {sel_gdf.geometry.total_bounds}")
lon_min, lat_min, lon_max, lat_max = sel_gdf.geometry.total_bounds
# Create a regionmask from the admin region geometry
admin_mask = regionmask.from_geopandas(sel_gdf, names='NUTS_ID')Found EL64 region: Στερεά Ελλάδα
Bounding box: [21.39637798 37.98898161 24.67199242 39.27219519]
Download¶
Download satellite land cover data from the Copernicus Climate Data Store. The dataset contains two versions:
Version 2.0.7cds: Years 1992-2015
Version 2.1.1: Years 2016-present
Dataset information: https://
# Download land cover data for each year
for yyyy in range(start_year, end_year + 1):
print(f"Year {yyyy}: Downloading land cover data...")
# Check if file already exists
zip_file = satellite_dir / f'land_cover_{admin_id}_{yyyy}.zip'
if zip_file.exists():
print(f" - File {zip_file.name} already exists. Skipping download.")
continue
# Select version based on year
if yyyy <= 2015:
lc_version = "v2_0_7cds"
else:
lc_version = "v2_1_1"
dataset = "satellite-land-cover"
request = {
"variable": "all",
"year": [str(yyyy)],
"version": [lc_version],
"area": [lat_max, lon_min, lat_min, lon_max] # [North, West, South, East]
}
client = cdsapi.Client()
client.retrieve(dataset, request).download(str(zip_file))
print(f" - Downloaded {zip_file.name}")
print("\nLand cover download complete!")Year 1992: Downloading land cover data...
- File land_cover_EL64_1992.zip already exists. Skipping download.
Year 1993: Downloading land cover data...
- File land_cover_EL64_1993.zip already exists. Skipping download.
Year 1994: Downloading land cover data...
- File land_cover_EL64_1994.zip already exists. Skipping download.
Year 1995: Downloading land cover data...
- File land_cover_EL64_1995.zip already exists. Skipping download.
Year 1996: Downloading land cover data...
- File land_cover_EL64_1996.zip already exists. Skipping download.
Year 1997: Downloading land cover data...
- File land_cover_EL64_1997.zip already exists. Skipping download.
Year 1998: Downloading land cover data...
- File land_cover_EL64_1998.zip already exists. Skipping download.
Year 1999: Downloading land cover data...
- File land_cover_EL64_1999.zip already exists. Skipping download.
Year 2000: Downloading land cover data...
- File land_cover_EL64_2000.zip already exists. Skipping download.
Year 2001: Downloading land cover data...
- File land_cover_EL64_2001.zip already exists. Skipping download.
Year 2002: Downloading land cover data...
- File land_cover_EL64_2002.zip already exists. Skipping download.
Year 2003: Downloading land cover data...
- File land_cover_EL64_2003.zip already exists. Skipping download.
Year 2004: Downloading land cover data...
- File land_cover_EL64_2004.zip already exists. Skipping download.
Year 2005: Downloading land cover data...
- File land_cover_EL64_2005.zip already exists. Skipping download.
Year 2006: Downloading land cover data...
- File land_cover_EL64_2006.zip already exists. Skipping download.
Year 2007: Downloading land cover data...
- File land_cover_EL64_2007.zip already exists. Skipping download.
Year 2008: Downloading land cover data...
- File land_cover_EL64_2008.zip already exists. Skipping download.
Year 2009: Downloading land cover data...
- File land_cover_EL64_2009.zip already exists. Skipping download.
Year 2010: Downloading land cover data...
- File land_cover_EL64_2010.zip already exists. Skipping download.
Year 2011: Downloading land cover data...
- File land_cover_EL64_2011.zip already exists. Skipping download.
Year 2012: Downloading land cover data...
- File land_cover_EL64_2012.zip already exists. Skipping download.
Year 2013: Downloading land cover data...
- File land_cover_EL64_2013.zip already exists. Skipping download.
Year 2014: Downloading land cover data...
- File land_cover_EL64_2014.zip already exists. Skipping download.
Year 2015: Downloading land cover data...
- File land_cover_EL64_2015.zip already exists. Skipping download.
Year 2016: Downloading land cover data...
- File land_cover_EL64_2016.zip already exists. Skipping download.
Year 2017: Downloading land cover data...
- File land_cover_EL64_2017.zip already exists. Skipping download.
Year 2018: Downloading land cover data...
- File land_cover_EL64_2018.zip already exists. Skipping download.
Year 2019: Downloading land cover data...
- File land_cover_EL64_2019.zip already exists. Skipping download.
Year 2020: Downloading land cover data...
- File land_cover_EL64_2020.zip already exists. Skipping download.
Year 2021: Downloading land cover data...
- File land_cover_EL64_2021.zip already exists. Skipping download.
Year 2022: Downloading land cover data...
- File land_cover_EL64_2022.zip already exists. Skipping download.
Land cover download complete!
Process¶
Create region mask¶
# Inspect one zip file to create mask
first_year = start_year
zip_file = satellite_dir / f'land_cover_{admin_id}_{first_year}.zip'
with zipfile.ZipFile(zip_file, 'r') as zip_ref:
# List all files in the zip
file_list = zip_ref.namelist()
print(f"Files in {zip_file.name}:")
for f in file_list:
print(f" - {f}")
# Find the NetCDF file
nc_files_in_zip = [f for f in file_list if f.endswith('.nc')]
if nc_files_in_zip:
lc_filename = nc_files_in_zip[0]
# Extract only this file
zip_ref.extract(lc_filename, satellite_dir)
else:
print("No NetCDF file found in zip!")
# Load land cover data to get grid
lc_file = satellite_dir / lc_filename
print(f"\nUsing file: {lc_file.name}")
lc_ds = xr.open_dataset(lc_file)
# Subset to our region (initial bounding box)
lc_region = lc_ds.sel(
lon=slice(lon_min, lon_max),
lat=slice(lat_max, lat_min)
)
# Create admin region mask: 1 where inside region, 0 elsewhere
lc_admin_mask_raw = admin_mask.mask(lc_region.lon, lc_region.lat)
lc_admin_mask = xr.DataArray(
(~np.isnan(lc_admin_mask_raw.values)).astype(float),
coords={'lat': lc_region.lat, 'lon': lc_region.lon},
dims=['lat', 'lon']
)
print(f"\nAdmin region area (grid cells): {lc_admin_mask.sum().values:.0f}")
print(f"Grid shape: {lc_admin_mask.shape}")
# Clean up extracted file
lc_file.unlink()Files in land_cover_EL64_1992.zip:
- ESACCI-LC-L4-LCCS-Map-300m-P1Y-1992-v2.0.7cds.area-subset.39.27219519300007.24.671992415000034.37.98898161400007.21.396377976000053.nc
Using file: ESACCI-LC-L4-LCCS-Map-300m-P1Y-1992-v2.0.7cds.area-subset.39.27219519300007.24.671992415000034.37.98898161400007.21.396377976000053.nc
/usr/local/apps/python3/3.12.11-01/lib/python3.12/site-packages/gribapi/__init__.py:23: UserWarning: ecCodes 2.42.0 or higher is recommended. You are running version 2.31.0
warnings.warn(
Admin region area (grid cells): 215798
Grid shape: (462, 1179)
Calculate agricultural land fraction¶
Agricultural land corresponds to LCCS classes 10-40 in the satellite dataset.
# Initialize list to store results
ag_data = []
latest_ag_mask = None # Store latest year for saving as raster
latest_lc_region = None # Store latest region data for coordinate info
for yyyy in range(start_year, end_year + 1):
print(f"Processing year {yyyy}...")
# Extract and load NetCDF file
zip_file = satellite_dir / f'land_cover_{admin_id}_{yyyy}.zip'
with zipfile.ZipFile(zip_file, 'r') as zip_ref:
file_list = zip_ref.namelist()
nc_files_in_zip = [f for f in file_list if f.endswith('.nc')]
if nc_files_in_zip:
lc_filename = nc_files_in_zip[0]
zip_ref.extract(lc_filename, satellite_dir)
else:
print(f" - No NetCDF file found in {zip_file.name}. Skipping.")
continue
# Load land cover data
lc_file = satellite_dir / lc_filename
lc_ds = xr.open_dataset(lc_file)
# Subset to our region
lc_region = lc_ds.sel(
lon=slice(lon_min, lon_max),
lat=slice(lat_max, lat_min)
)
# Create agricultural land mask (LCCS classes 10-40)
ag_mask = ((lc_region['lccs_class'] >= 10) & (lc_region['lccs_class'] <= 40)).astype(float)
# Calculate agricultural land fraction within admin region
ag_fraction = (ag_mask * lc_admin_mask).sum().values / lc_admin_mask.sum().values
print(f" - Agricultural land coverage: {ag_fraction*100:.1f}%")
# Store result
ag_data.append({'year': yyyy, 'ag_fraction': ag_fraction})
# Save latest year for raster output
if yyyy == end_year:
latest_ag_mask = ag_mask.copy()
latest_lc_region = lc_region.copy()
# Clean up extracted file
lc_file.unlink()
# Create pandas DataFrame
ag_timeseries = pd.DataFrame(ag_data)
print("\nAgricultural land timeseries:")
print(ag_timeseries)Processing year 1992...
- Agricultural land coverage: 30.5%
Processing year 1993...
- Agricultural land coverage: 30.5%
Processing year 1994...
- Agricultural land coverage: 30.5%
Processing year 1995...
- Agricultural land coverage: 30.1%
Processing year 1996...
- Agricultural land coverage: 29.8%
Processing year 1997...
- Agricultural land coverage: 29.6%
Processing year 1998...
- Agricultural land coverage: 29.4%
Processing year 1999...
- Agricultural land coverage: 28.7%
Processing year 2000...
- Agricultural land coverage: 28.5%
Processing year 2001...
- Agricultural land coverage: 28.4%
Processing year 2002...
- Agricultural land coverage: 28.3%
Processing year 2003...
- Agricultural land coverage: 28.3%
Processing year 2004...
- Agricultural land coverage: 28.2%
Processing year 2005...
- Agricultural land coverage: 28.2%
Processing year 2006...
- Agricultural land coverage: 28.1%
Processing year 2007...
- Agricultural land coverage: 28.1%
Processing year 2008...
- Agricultural land coverage: 27.9%
Processing year 2009...
- Agricultural land coverage: 27.9%
Processing year 2010...
- Agricultural land coverage: 27.9%
Processing year 2011...
- Agricultural land coverage: 27.9%
Processing year 2012...
- Agricultural land coverage: 27.9%
Processing year 2013...
- Agricultural land coverage: 27.9%
Processing year 2014...
- Agricultural land coverage: 27.8%
Processing year 2015...
- Agricultural land coverage: 27.8%
Processing year 2016...
- Agricultural land coverage: 27.8%
Processing year 2017...
- Agricultural land coverage: 27.8%
Processing year 2018...
- Agricultural land coverage: 27.8%
Processing year 2019...
- Agricultural land coverage: 27.7%
Processing year 2020...
- Agricultural land coverage: 27.7%
Processing year 2021...
- Agricultural land coverage: 27.7%
Processing year 2022...
- Agricultural land coverage: 27.7%
Agricultural land timeseries:
year ag_fraction
0 1992 0.304581
1 1993 0.304572
2 1994 0.304567
3 1995 0.301157
4 1996 0.298070
5 1997 0.296245
6 1998 0.293946
7 1999 0.287148
8 2000 0.284567
9 2001 0.283580
10 2002 0.283093
11 2003 0.282551
12 2004 0.282000
13 2005 0.281638
14 2006 0.281384
15 2007 0.281008
16 2008 0.279372
17 2009 0.279048
18 2010 0.278890
19 2011 0.278649
20 2012 0.278548
21 2013 0.278506
22 2014 0.278265
23 2015 0.278200
24 2016 0.277964
25 2017 0.277709
26 2018 0.277560
27 2019 0.277074
28 2020 0.276944
29 2021 0.277014
30 2022 0.276912
Save¶
Save results to CSV¶
# Create output directory
output_dir = data_dir / admin_id / 'satellite_land_cover'
output_dir.mkdir(parents=True, exist_ok=True)
# Save to CSV
csv_file = output_dir / f'agricultural_land_fraction_{admin_id}.csv'
ag_timeseries.to_csv(csv_file, index=False)
print(f"Saved agricultural land fraction timeseries to: {csv_file}")
print(f"\nSummary statistics:")
print(ag_timeseries['ag_fraction'].describe())Saved agricultural land fraction timeseries to: data/EL64/satellite_land_cover/agricultural_land_fraction_EL64.csv
Summary statistics:
count 31.000000
mean 0.284541
std 0.009277
min 0.276912
25% 0.278232
50% 0.281008
75% 0.285858
max 0.304581
Name: ag_fraction, dtype: float64
Save spatial raster (latest year)¶
# Save the most recent year as GeoTIFF for use in tutorial
if latest_ag_mask is not None and latest_lc_region is not None:
# Set spatial dimensions and add CRS information
latest_ag_mask = latest_ag_mask.rio.set_spatial_dims(x_dim='lon', y_dim='lat')
latest_ag_mask.rio.write_crs("EPSG:4326", inplace=True)
# Save as GeoTIFF
tif_file = output_dir / f"satellite_landcover_{admin_id}.tif"
latest_ag_mask.rio.to_raster(tif_file, driver="GTiff")
print(f"Saved {end_year} satellite land cover raster to: {tif_file}")
else:
print("No latest year data available to save as raster.")Saved 2022 satellite land cover raster to: data/EL64/satellite_land_cover/satellite_landcover_EL64.tif
Output summary¶
Output files created:
agricultural_land_fraction_{admin_id}.csv- Agricultural land fraction timeseries (1992-2022)satellite_landcover_{admin_id}.tif- Spatial raster of satellite land cover (latest year)
These files can now be used in the agricultural land tutorial for visualization and comparison with CORINE data.