Routine generation of higher-level CHUK vegetation products from Sentinel-2 Analysis Ready Data (ARD)

Authors: Dr Omar Mohamed and Prof Jadu Dash (University of Southampton)

Keywords: Sentinel-2, Analysis Ready Data (ARD), Leaf Area Index (LAI), Vegetation monitoring, Atmospheric correction, SL2P

This case study evaluates the suitability of Sentinel-2 Analysis Ready Data (ARD) available on the Earth Observation Data Hub (EODH) for the routine generation of Leaf Area Index (LAI) vegetation products. Using the SL2P algorithm, we compared LAI products derived from standard Sentinel-2 surface reflectance against those derived from EODH ARD across four land cover types (grassland, arable, coniferous forest, and broadleaved forest) in East Anglia, UK, over a full growing season (January to December 2024). Results show that ARD-based LAI products are systematically higher than standard products, with mean differences ranging from +1.04 (grassland) to +2.10 (broadleaved forest). The bias increases with vegetation density, which suggests that the improved atmospheric correction in ARD products recovers more signal in complex canopies. EODH enabled consistent, analysis-ready data access across the study period, facilitating robust time series analysis without extensive preprocessing.


Analysis Ready Data: Reducing Preprocessing Effort

The primary objective of this study was to evaluate whether Sentinel-2 Analysis Ready Data (ARD) available on EODH is suitable for routine generation of high-quality Leaf Area Index (LAI) vegetation products. LAI is a critical biophysical parameter for monitoring vegetation health, crop growth, and ecosystem productivity. The existing CHUK LAI products rely on standard Sentinel-2 surface reflectance, which requires extensive preprocessing. This study tested whether ARD products, with their built-in atmospheric correction and geometric registration, could produce more accurate and consistent LAI estimates while reducing the preprocessing burden.

Open Access EO Datasets For Research

The study covered the period from January to December 2024, encompassing a full growing season in East Anglia, United Kingdom. Two data sources were compared:

CHUK LAI products. Generated using the SL2P algorithm applied to standard Copernicus Sentinel-2 surface reflectance.

EODH Sentinel-2 ARD LAI products. Generated using the same SL2P algorithm applied to EODH Analysis Ready Data.

Both datasets are open access and were processed with identical SL2P radiative transfer models, so that differences in LAI can be attributed to the preprocessing of the input surface reflectance.

Comparing Leaf Area Index Values

The comparison followed a rigorous four-stage analytical framework:

  1. Pixel-by-Pixel Difference Analysis: For each date, we calculated ΔLAI = LAI_ARD - LAI_Standard, generated difference histograms, and created spatial maps to identify systematic biases.
  2. Land Cover Stratification: Using UK-wide land cover masks (broadleaved, coniferous, arable, grassland), we stratified the analysis to evaluate product performance across vegetation types. This was critical because SL2P is known to underestimate LAI in complex canopies.
  3. Time Series Analysis: For each land cover type, we extracted LAI values from both products across all available dates and compared temporal trajectories, focusing on seasonal patterns and noise levels.
  4. Quantitative Metrics: We calculated bias (mean ΔLAI), Root Mean Square Difference (RMSD), and correlation coefficients to quantify agreement between products.
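
Stages 1, 2 and 4 above can be illustrated with a minimal NumPy sketch. It is not the study's code: the function names, the integer land cover codes and the toy arrays are hypothetical, and it assumes the two LAI rasters and the land cover mask have already been read onto the same grid (for example with Rasterio), with invalid pixels stored as NaN.

```python
import numpy as np

def difference_metrics(lai_ard, lai_std, mask=None):
    """Bias, RMSD and Pearson r between two co-registered LAI arrays."""
    ard = np.asarray(lai_ard, dtype=float)
    std = np.asarray(lai_std, dtype=float)
    # Keep only pixels that are valid in both products.
    valid = np.isfinite(ard) & np.isfinite(std)
    if mask is not None:
        valid &= np.asarray(mask, dtype=bool)
    delta = ard[valid] - std[valid]  # delta LAI = LAI_ARD - LAI_Standard
    return {
        "n": int(delta.size),
        "bias": float(delta.mean()),
        "rmsd": float(np.sqrt(np.mean(delta ** 2))),
        "r": float(np.corrcoef(ard[valid], std[valid])[0, 1]),
    }

def stratified_metrics(lai_ard, lai_std, land_cover, classes):
    """Compute the same metrics separately for each land cover class code."""
    land_cover = np.asarray(land_cover)
    return {
        name: difference_metrics(lai_ard, lai_std, land_cover == code)
        for name, code in classes.items()
    }

# Toy 2 x 3 grid; the class codes 1 and 2 are made up for illustration.
lai_std = np.array([[1.0, 2.0, 1.5], [3.0, 4.0, np.nan]])
lai_ard = np.array([[2.0, 3.5, 2.5], [5.0, 6.5, 4.0]])
land_cover = np.array([[1, 1, 1], [2, 2, 2]])
result = stratified_metrics(
    lai_ard, lai_std, land_cover, {"grassland": 1, "broadleaved": 2}
)
```

Restricting every metric to pixels that are valid in both products keeps the bias and RMSD comparable between classes, because a pixel masked in one product never contributes to only one side of the difference.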
Picture2

Figure 1: Comparison of mean LAI values derived from EODH ARD (orange) and standard Sentinel-2 (blue) products across four land cover types. Error bars represent standard deviation. ARD produces systematically higher LAI values, with the largest difference observed in broadleaved forests (+2.10, 105% relative difference).

The analysis was implemented in Python 3.10 using open-source libraries including Rasterio for geospatial raster processing, GeoPandas for shapefile manipulation, PyProj for coordinate reference system transformations, and Matplotlib/Seaborn for visualization. All processing was performed on the University of Southampton's high-performance computing infrastructure.

Systematic Bias. EODH ARD products produced consistently higher LAI values across all land cover types (Figure 1). The bias ranged from +1.04 (grassland) to +2.10 (broadleaved forests), representing relative differences of 59-105%.
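
The link between the absolute and relative figures can be made explicit with a short calculation. The sketch below assumes that the relative difference is the mean ΔLAI expressed as a percentage of the standard-product mean LAI; the text does not state this definition, so the back-calculated reference values are illustrative and are not reported results.

```python
def relative_difference(delta_lai, reference_lai):
    """Delta LAI as a percentage of a reference (assumed: standard-product mean)."""
    return 100.0 * delta_lai / reference_lai

def implied_reference(delta_lai, relative_pct):
    """Invert the definition above to recover the reference mean LAI."""
    return delta_lai / (relative_pct / 100.0)

# Endpoints quoted in the text: +1.04 at 59% and +2.10 at 105%.
grass_ref = implied_reference(1.04, 59.0)    # roughly 1.76 under the assumption
broad_ref = implied_reference(2.10, 105.0)   # roughly 2.00 under the assumption

# Round trip: feeding the implied reference back in returns the quoted percentage.
grass_pct = relative_difference(1.04, grass_ref)
```

If the relative difference were instead defined against the ARD mean or the average of the two products, the implied values would change, so the definition should be stated alongside the percentages.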

Vegetation Density Relationship. The bias increased with vegetation density (grassland < arable < coniferous < broadleaved), consistent with ARD's improved atmospheric correction recovering more signal in complex canopies, where standard SL2P underestimation is most severe.

Spatial Consistency. Analysis of 28 million valid pixels showed that the difference is systematic rather than random, with standard deviations of 0.87-1.75 across land cover types (Figure 3).

Temporal Patterns. Time series analysis revealed that ARD products capture similar phenological patterns with reduced noise compared to standard products, validating the value of ARD for vegetation monitoring (Figure 2).
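
One possible way to quantify "noise" in such a comparison is sketched below. The text does not say which measure was used, so the roughness statistic here (standard deviation of second differences of the class-mean series) is an assumption, as are the function names and the synthetic data. It also assumes roughly evenly spaced acquisition dates, which cloud cover often prevents in practice.

```python
import numpy as np

def class_mean_series(lai_stack, class_mask):
    """Mean LAI per date over one land cover mask.

    lai_stack has shape (dates, rows, cols); NaN marks invalid pixels.
    """
    stack = np.asarray(lai_stack, dtype=float)
    mask = np.asarray(class_mask, dtype=bool)
    return np.array([np.nanmean(layer[mask]) for layer in stack])

def roughness(series):
    """Standard deviation of second differences: larger means a noisier trajectory."""
    return float(np.std(np.diff(np.asarray(series, dtype=float), n=2)))

# Synthetic seasonal curve over 12 dates, plus a copy with alternating +/-0.3 jitter.
smooth = 4.0 * np.sin(np.linspace(0.0, np.pi, 12))
noisy = smooth + 0.3 * (-1.0) ** np.arange(12)

# Build a tiny (12, 2, 2) stack in which every pixel follows the smooth curve.
stack = np.stack([np.full((2, 2), value) for value in smooth])
mask = np.array([[True, False], [True, True]])
series = class_mean_series(stack, mask)

smooth_score = roughness(series)
noisy_score = roughness(noisy)
```

Second differences remove the slow seasonal trend, so the statistic responds mainly to date-to-date jitter rather than to phenology itself.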

Picture3

Figure 2: Time series analysis for coniferous forest showing (a) LAI trajectories for ARD and standard products across the 2024 growing season and (b) ΔLAI over time, demonstrating a consistent positive bias.

How EODH supported the research

The Earth Observation Data Hub (EODH) was instrumental to this project in several ways:

Analysis-Ready Data. EODH provided Sentinel-2 ARD products with pre-applied atmospheric correction, cloud masking, and geometric registration, eliminating the need for extensive preprocessing that would have required weeks of additional work.

Consistent Time Series. Access to consistently processed ARD enabled robust time series analysis across the full growing season, which would have been challenging to assemble from raw Sentinel-2 data.

Reproducibility. The standardized ARD format ensures that our methodology can be replicated by other researchers using EODH data, supporting open science principles.

Scalability. EODH's data delivery infrastructure will enable efficient processing of UK-wide analyses without requiring local storage of raw Sentinel-2 archives.

Operationally Scalable Vegetation Monitoring

This study demonstrates that Sentinel-2 ARD products available on EODH are highly suitable for routine vegetation product generation. The systematically higher LAI values from ARD, particularly in complex forest canopies, suggest that improved atmospheric correction (FORCE algorithm) yields more accurate estimates of vegetation structure. For operational vegetation monitoring, we recommend adopting EODH ARD as the primary input, as it reduces preprocessing requirements while potentially improving product accuracy.


Furthermore, EODH's data delivery infrastructure will enable efficient processing of large-scale analyses across the entire United Kingdom without requiring local storage of raw Sentinel-2 archives. This capability addresses a significant challenge in traditional Earth observation workflows, where downloading and managing petabyte-scale raw data archives often limits the spatial and temporal scope of research.

Future work will extend this analysis to:

Other biophysical parameters (e.g., fractional vegetation cover, fAPAR)

Validation against field measurements

Scaling the methodology to produce LAI maps for the whole UK using EODH's ARD archive

Picture4

Figure 3: Spatial map of ΔLAI (ARD minus Standard) for East Anglia on June 21, 2024. Red areas indicate where ARD produces higher LAI values, concentrated in forested regions. Blue areas show where the standard product yields higher values. The spatial pattern confirms that ARD's improved atmospheric correction has the greatest impact in complex vegetated landscapes.