Rapid Tree Canopy Detection

Authors: Gareth Bennett, Jonathan Hendry (4EI Ltd)

Keywords: Environment, Planet SkySat, Pixel Classification, Deep Learning Models, Climate Adaptation, Climate Resilience, Nature Value-at-Risk (NVaR)

Urban greening initiatives require accurate, scalable baselines of tree canopy coverage. This case study demonstrates a rapid, automated workflow for detecting and classifying urban tree canopies in the borough of Lambeth, London, UK, during the peak leaf-on season (May–September 2022), using the cloud-computing resources of the Earth Observation DataHub (EODH). We trained a U-Net deep learning model on high-resolution (50 cm) commercial multispectral imagery from Planet SkySat. Processing this dataset on EODH's scalable GPUs allowed the model to generate accurate tree canopy masks rapidly, which we validated using the Intersection over Union (IoU) metric. The resulting data supports the management of urban heat islands, tree inequality, biodiversity and green corridors. Future work will integrate open-access Sentinel-2 data to monitor seasonal changes and extend the model to other commercial sources of optical imagery.


Figure 1: Overview of the tree canopy extent detection over Stockwell, Lambeth

The Challenge: Mapping The Urban Forest In Lambeth

In densely populated urban environments like the London borough of Lambeth, tracking municipal greening initiatives is crucial for managing urban heat islands and improving air quality. Trees provide essential localised cooling, support urban biodiversity, and help mitigate localised flooding through stormwater retention. However, establishing an accurate, up-to-date baseline of tree canopy coverage using manual ground surveys is notoriously time-intensive, expensive, and difficult to scale. Our primary goal was to solve this bottleneck by rapidly training and validating a U-Net deep learning model capable of automatically detecting and classifying the precise extent of urban tree canopies across the entire borough.


Figure 2: Lambeth Borough Council Boundary overlain on the ESRI Standard Basemap

Sourcing The Right Data At The Right Time

To ensure the model had the best possible visual data, we focused our temporal coverage strictly on May to September. This window captures the UK's peak leaf-on season, providing the highest visual contrast for the deep learning model to accurately delineate canopy boundaries against man-made infrastructure.

For the base imagery, we used Planet SkySat, a commercial product offering 50 cm high-resolution multispectral imagery. We chose this dataset because it balances fine spatial detail with manageable data volumes. Crucially, SkySat provides a Near-Infrared (NIR) band alongside standard RGB. Healthy vegetation reflects strongly in the NIR, which makes this band a powerful distinguishing feature for classification. SkySat also offered high-frequency, cloud-free coverage over Lambeth.
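To illustrate why the NIR band is so useful, the sketch below computes the Normalised Difference Vegetation Index (NDVI), a standard index that contrasts NIR and red reflectance. The case study does not state that NDVI was used; the U-Net consumed the multispectral bands directly. The reflectance values and the 0.4 threshold are hypothetical and chosen only for illustration.

```python
import numpy as np

def ndvi(nir, red, eps=1e-6):
    """Normalised Difference Vegetation Index: (NIR - Red) / (NIR + Red)."""
    nir = np.asarray(nir, dtype="float32")
    red = np.asarray(red, dtype="float32")
    # eps avoids division by zero over pixels where both bands are 0
    return (nir - red) / (nir + red + eps)

# Hypothetical 2x2 reflectance patch: left column canopy, right column roof/tarmac.
nir = np.array([[0.50, 0.20],
                [0.45, 0.18]])
red = np.array([[0.05, 0.18],
                [0.06, 0.17]])

index = ndvi(nir, red)

# Illustrative threshold only; it is not a value taken from the study.
vegetation_mask = index > 0.4
print(vegetation_mask)
```

Canopy pixels score well above built surfaces because NIR reflectance is high while red reflectance is low, which is the contrast a learned model can also exploit.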

How can EO imagery be used to detect tree canopies?

🌈 High reflection in the Near-Infrared band indicates healthy vegetation

🍃 Leaf-on season creates the highest visual contrast against an urban backdrop

🌳 Image resolution must be high enough to spot individual tree crowns

What 4EI needed from an EO processing platform
  • A UK-based cloud computing platform on which to migrate and scale their heavy-duty processing, reducing dependence on internationally hosted infrastructure
  • The ability to generate trustworthy Earth Observation insights for clients
  • An environment for working with sub‑metre commercial multispectral imagery, which creates large, memory‑intensive datasets

Leveraging JupyterHub Processing Infrastructure On EODH

Processing sub-metre satellite imagery generates massive, memory-intensive data arrays that can easily overwhelm standard local workstations, so robust compute power and efficient data handling were essential. We used the Earth Observation DataHub (EODH) JupyterHub environment, taking advantage of its direct access to cloud storage. To handle the computational load, the project relied on TensorFlow integrated with Dask for distributed processing. This combination allowed us to scale our U-Net architecture in a cloud environment provisioned with up to 4 GPUs dedicated to training, testing, and inference.
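A rough back-of-the-envelope calculation shows why these arrays are memory-intensive. The sketch below estimates the uncompressed size of a raster from its footprint, pixel size, band count and sample width. The 10 km x 10 km footprint and the sample widths are assumptions for illustration, not figures from the project.

```python
def raster_size_gb(width_m, height_m, pixel_size_m, bands, bytes_per_sample):
    """Uncompressed in-memory size of a raster, in gigabytes (1 GB = 1e9 bytes)."""
    cols = round(width_m / pixel_size_m)
    rows = round(height_m / pixel_size_m)
    return cols * rows * bands * bytes_per_sample / 1e9

# Hypothetical 10 km x 10 km scene at 50 cm with 4 bands (RGB + NIR).
as_uint16 = raster_size_gb(10_000, 10_000, 0.5, bands=4, bytes_per_sample=2)
as_float32 = raster_size_gb(10_000, 10_000, 0.5, bands=4, bytes_per_sample=4)

print(f"uint16: {as_uint16:.1f} GB, float32: {as_float32:.1f} GB")
```

Under these assumptions a single scene occupies several gigabytes before any intermediate copies are made, which is why chunked, out-of-core handling with a tool such as Dask becomes attractive.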


Figure 3: Detection of tree canopy extent around Kennington Oval

Distributed Cloud Computing for Earth Observation with Dask

By utilising cloud-native tools and distributed computing, we established a highly efficient, end-to-end workflow:

⬅️ Data Ingestion. We bypassed local storage bottlenecks by ingesting the massive Planet SkySat multispectral imagery directly from an S3 storage bucket into our EODH JupyterHub environment, enabling high-throughput data transfer.

💻 Preprocessing. The Lambeth geographical area was divided into standardised 512 × 512 pixel training tiles. Dask was instrumental here in managing the large multispectral arrays efficiently in memory, allowing us to prepare the dataset without exhausting system memory.

🧪 Model Training & Testing. We fed the annotated tiles into our TensorFlow-based U-Net model, splitting the dataset into 80% training and 20% validation. Distributing the training workload across the 4 GPUs cut training and testing times from days to hours.

🌳 Inference & Validation. Using the same multi-GPU setup, we ran inference to rapidly generate an accurate canopy mask for the entire target region. We assessed the model with the Intersection over Union (IoU) metric against a holdout test set, which measures how closely the model's predictions overlap with human-verified ground truth data.
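The preprocessing, splitting and validation steps above can be sketched in a few lines of NumPy. This is a minimal illustration under stated assumptions, not the project's code: the function names, the synthetic scene, the random seed and the choice to drop partial edge tiles are all hypothetical, and the real workflow used Dask and TensorFlow rather than plain NumPy.

```python
import numpy as np

TILE = 512  # tile edge in pixels, as described in the text

def tile_image(image, tile=TILE):
    """Cut a (rows, cols, bands) array into non-overlapping tile x tile patches.

    Edge pixels that do not fill a whole tile are dropped here (an assumption;
    padding the scene is an equally valid choice).
    """
    rows, cols = image.shape[:2]
    return np.stack([
        image[r:r + tile, c:c + tile]
        for r in range(0, rows - tile + 1, tile)
        for c in range(0, cols - tile + 1, tile)
    ])

def train_val_split(n_tiles, val_fraction=0.2, seed=0):
    """Return shuffled (train, validation) index arrays for an 80/20 split."""
    order = np.random.default_rng(seed).permutation(n_tiles)
    n_val = int(round(n_tiles * val_fraction))
    return order[n_val:], order[:n_val]

def iou(pred, truth):
    """Intersection over Union of two binary masks."""
    pred, truth = np.asarray(pred, dtype=bool), np.asarray(truth, dtype=bool)
    union = np.logical_or(pred, truth).sum()
    if union == 0:
        return 1.0  # both masks empty: treated as perfect agreement
    return np.logical_and(pred, truth).sum() / union

# Synthetic stand-in for a 4-band scene (no real imagery is used here).
scene = np.zeros((1100, 1600, 4), dtype="uint16")
tiles = tile_image(scene)
train_idx, val_idx = train_val_split(len(tiles))

# Toy masks: the prediction is shifted one row down from the ground truth.
truth = np.zeros((4, 4), dtype=bool)
truth[0:2, :] = True
pred = np.zeros((4, 4), dtype=bool)
pred[1:3, :] = True
score = iou(pred, truth)
print(tiles.shape, len(train_idx), len(val_idx), round(float(score), 3))
```

In the toy masks the two regions share one row of four pixels out of three rows covered in total, so the IoU is 4/12. A score of 1.0 would indicate that the predicted and ground truth canopy extents coincide exactly.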

How EODH helped solve their challenge

☑️ Dask facilitated scalable cloud-computing that enabled training and inference of machine-learning models

☑️ Migrating existing scripts from alternative platforms was straightforward

☑️ JupyterHub provided a flexible processing environment to work with Python packages like TensorFlow

☑️ Technical support teams provisioned additional GPUs on request

Future Scalability With Novel Datasets

This case study demonstrates the viability of rapid, automated canopy mapping at borough level using distributed cloud computing. In follow-on work, we plan to integrate even higher-resolution commercial datasets, such as the 30 cm Airbus Pleiades Neo, to further refine canopy boundaries and detect smaller urban vegetation features like single street trees. We also plan to incorporate open-access datasets, such as ESA Sentinel-2 imagery, to monitor seasonal phenology over time.
