Keywords: Environment, Planet SkySat, Pixel Classification, Deep Learning Models, Climate Adaptation, Climate Resilience, Nature Value-at-Risk (NVaR)
Figure 1: Overview of the tree canopy extent detection over Stockwell, Lambeth
In densely populated urban environments like the London borough of Lambeth, tracking municipal greening initiatives is crucial for managing urban heat islands and improving air quality. Trees provide essential cooling, support urban biodiversity, and help mitigate localised flooding through stormwater retention. However, establishing an accurate, up-to-date baseline of tree canopy coverage through manual ground surveys is time-intensive, expensive, and difficult to scale. Our primary goal was to remove this bottleneck by rapidly training and validating a U-Net deep learning model that automatically detects and delineates the extent of urban tree canopy across the entire borough.
Figure 2: Lambeth Borough Council Boundary overlain on the ESRI Standard Basemap
To ensure the model had the best possible visual data, we focused our temporal coverage strictly on May to September. This window captures the UK's peak leaf-on season, providing the highest visual contrast for the deep learning model to accurately delineate canopy boundaries against man-made infrastructure.
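As a minimal sketch of that seasonal filter, the snippet below keeps only acquisition dates that fall between May and September. The scene dates are hypothetical and chosen purely for illustration; the case study does not list the actual acquisitions or how they were selected.

```python
from datetime import date

# Leaf-on window described in the text: May (5) to September (9), inclusive.
LEAF_ON_MONTHS = range(5, 10)


def is_leaf_on(acquired: date) -> bool:
    """Return True if the acquisition date falls within May to September."""
    return acquired.month in LEAF_ON_MONTHS


# Hypothetical acquisition dates, for illustration only.
scenes = [
    date(2023, 3, 14),
    date(2023, 5, 1),
    date(2023, 7, 22),
    date(2023, 9, 30),
    date(2023, 10, 1),
]

# Keep only the scenes captured during the leaf-on season.
leaf_on_scenes = [d for d in scenes if is_leaf_on(d)]
```

In practice the same check would likely be applied to the acquisition timestamp in each scene's metadata before any imagery is downloaded.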
For the base imagery, we used SkySat data from Planet, a commercial provider, which offers multispectral imagery at 50 cm resolution. We chose this dataset because it is detailed enough to resolve tree crowns while keeping data volumes manageable. Crucially, SkySat provides a Near-Infrared (NIR) band alongside standard RGB. Healthy vegetation reflects strongly in the NIR band, which makes it a powerful distinguishing feature for our classification. SkySat's frequent acquisitions also made it possible to obtain cloud-free coverage over Lambeth.
🌈 High reflectance in the Near-Infrared band indicates healthy vegetation
🍃 Leaf-on season creates the highest visual contrast against an urban backdrop
🌳 Image resolution must be high enough to spot individual tree crowns
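To illustrate why the NIR band is such a useful feature, the sketch below computes the Normalised Difference Vegetation Index (NDVI), a standard index that contrasts NIR and red reflectance. NDVI is not mentioned in the case study and is not claimed to be part of the U-Net pipeline; the reflectance values and the 0.4 threshold are illustrative assumptions only.

```python
import numpy as np


def ndvi(red, nir, eps=1e-9):
    """Compute NDVI = (NIR - red) / (NIR + red) element-wise.

    eps avoids division by zero where both bands are zero.
    """
    red = np.asarray(red, dtype=np.float64)
    nir = np.asarray(nir, dtype=np.float64)
    return (nir - red) / (nir + red + eps)


# Hypothetical surface reflectance values for a 2x2 patch:
# the left column mimics tree canopy, the right column mimics built surfaces.
red = np.array([[0.05, 0.30], [0.08, 0.25]])
nir = np.array([[0.50, 0.32], [0.45, 0.20]])

index = ndvi(red, nir)

# Illustrative threshold only; a suitable cut-off depends on sensor and scene.
vegetation_hint = index > 0.4
```

Pixels with high NIR and low red reflectance score close to 1, while roofs and roads sit near zero, which is the contrast a model can learn from when it receives the NIR band as an input channel.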
Processing sub-metre satellite imagery generates large, memory-intensive data arrays that can easily overwhelm standard local workstations, so robust compute power and efficient data handling were essential. We used the Earth Observation DataHub (EODH) JupyterHub environment, taking advantage of its direct access to cloud storage. To handle the computational load, the project relied on TensorFlow for model training, with Dask handling distributed data processing. This combination allowed us to scale our U-Net workflow in a cloud environment provisioned with up to 4 GPUs dedicated to training, testing, and inference.
Figure 3: Detection of tree canopy extent around Kennington Oval
By utilising cloud-native tools and distributed computing, we established a highly efficient, end-to-end workflow:
⬅️ Data Ingestion. We bypassed local storage bottlenecks by ingesting the massive Planet SkySat multispectral imagery directly from an S3 storage bucket into our EODH JupyterHub environment, enabling high-throughput data transfer.
💻 Preprocessing. The Lambeth geographical area was divided into standardised 512x512 pixel training tiles. Dask was instrumental here in managing the large multispectral arrays efficiently in memory, allowing us to prepare the dataset without exhausting system memory.
🧪 Model Training & Testing. We fed the annotated tiles into our TensorFlow-based U-Net model, splitting the dataset into 80% training and 20% validation. Distributing the training workload across the 4 GPUs cut training and testing times from days to hours.
🌳 Inference & Validation. Using the same multi-GPU setup, we ran inference to rapidly generate a canopy mask for the entire target region. We assessed the model with the Intersection over Union (IoU) metric against a holdout test set, which measures how closely the model's predictions overlap with human-verified ground truth data.
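The tiling, 80/20 split, and IoU steps in the workflow above can be sketched with NumPy as follows. This is a simplified illustration under stated assumptions, not the project's actual code: the helper names (`tile_image`, `train_val_split`, `iou`) are hypothetical, the image is a synthetic 4-band array, partial edge tiles are simply dropped, and no Dask or GPU distribution is shown.

```python
import numpy as np

TILE = 512  # tile edge length in pixels, as stated in the text


def tile_image(image, tile=TILE):
    """Split an (H, W, C) array into non-overlapping tile x tile patches.

    Assumption: edge pixels that do not fill a whole tile are dropped.
    """
    h, w = image.shape[:2]
    return [
        image[r:r + tile, c:c + tile]
        for r in range(0, h - tile + 1, tile)
        for c in range(0, w - tile + 1, tile)
    ]


def train_val_split(items, train_fraction=0.8, seed=0):
    """Shuffle items reproducibly and split them into training and validation lists."""
    rng = np.random.default_rng(seed)
    order = rng.permutation(len(items))
    cut = int(round(train_fraction * len(items)))
    train = [items[i] for i in order[:cut]]
    val = [items[i] for i in order[cut:]]
    return train, val


def iou(pred, truth):
    """Intersection over Union of two binary masks of the same shape."""
    pred = np.asarray(pred, dtype=bool)
    truth = np.asarray(truth, dtype=bool)
    union = np.logical_or(pred, truth).sum()
    if union == 0:
        # Both masks are empty; treating this as perfect agreement is a convention.
        return 1.0
    return float(np.logical_and(pred, truth).sum() / union)


# Synthetic 4-band image (assumed band count: blue, green, red, NIR).
image = np.zeros((1024, 2560, 4), dtype=np.uint16)
tiles = tile_image(image)            # 2 rows x 5 columns of tiles
train, val = train_val_split(tiles)  # 80% training, 20% validation

# Toy masks: 1 = canopy, 0 = background.
pred = np.array([[1, 1, 0], [0, 1, 0]])
truth = np.array([[1, 0, 0], [0, 1, 1]])
score = iou(pred, truth)  # 2 overlapping pixels out of 4 in the union
```

An IoU of 1.0 means the predicted canopy mask matches the ground truth exactly, while 0.0 means no overlap at all. Splitting at the tile level, as here, is the simplest option; a spatially grouped split may be preferable where neighbouring tiles are strongly correlated.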
☑️ Dask facilitated scalable cloud computing for training and inference of machine learning models
☑️ Migrating existing scripts from alternative platforms was straightforward
☑️ JupyterHub provided a flexible processing environment to work with Python packages like TensorFlow
☑️ Technical support teams provisioned additional GPUs on request
This case study demonstrates the viability of rapid, automated canopy mapping at borough level using distributed cloud computing. In follow-on work, we plan to integrate even higher-resolution commercial datasets, such as the 30 cm Airbus Pléiades Neo, to further refine canopy boundaries and detect smaller urban vegetation features like single street trees. We also plan to incorporate open-access datasets, such as ESA Sentinel-2 imagery, to monitor seasonal phenology over time.