Satlas: Monitoring the Planet with AI and Satellite Imagery

Favyen Bastani
Published in AI2 Blog
4 min read · Aug 31, 2023


The Allen Institute for AI is excited to announce Satlas, a new platform for exploring global geospatial data generated by AI from satellite imagery. Currently, Satlas includes three data products that are updated monthly:

- Marine infrastructure (offshore wind turbines and platforms)
- Renewable energy infrastructure (onshore wind turbines and solar farms)
- Tree cover

The data is released under an open license (ODC-BY), and can be visualized in the Satlas Map or downloaded for offline analysis. Over time, we plan to release additional geospatial data products.
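For example, once a data product has been downloaded, it can be inspected with standard geospatial tooling. The sketch below assumes a GeoJSON download and uses a placeholder file name (the actual file names and schema may differ):

```python
# Minimal sketch of offline analysis with geopandas (pip install geopandas).
# "marine_infrastructure.geojson" is a placeholder for a downloaded Satlas
# data file; actual file names and attribute schema may differ.
import geopandas as gpd

gdf = gpd.read_file("marine_infrastructure.geojson")
print(len(gdf), "features")

# Count features inside a rough North Sea bounding box using the .cx
# coordinate indexer (x = longitude, y = latitude).
north_sea = gdf.cx[0:9, 51:58]
print(len(north_sea), "features in the North Sea region")
```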

Offshore wind turbines in Europe detected by AI, displayed in the Satlas Map.

Timely geospatial data is critical for informing decisions in emissions reduction, disaster relief, urban planning, and more. For example, renewable energy infrastructure is rapidly expanding across the planet, and accurately tracking this growth across political boundaries is important for prioritizing resources and funds.

High-quality global geospatial data products can be hard to find, however. Manually curating these products involves tediously aggregating, cleaning, and correcting regional datasets from dozens of countries, assuming these datasets even exist. Indeed, existing geospatial data on renewable energy infrastructure is fragmented, and is only up-to-date in limited geographies.

An alternative is to analyze satellite images from sources like Landsat and Sentinel-2, which capture images covering most of the Earth every few days. Manual analysis is infeasible due to the sheer volume of data, and automatic analysis of these images has long been error-prone due to their low resolution: a single pixel corresponds to a 10 m x 10 m (100 m²) plot of land.

A Sentinel-2 image of downtown Seattle (ESA, 2022), highlighting the low 10 m/pixel resolution that makes automatic analysis challenging.

However, given enough training examples, modern deep learning methods can extract data like the positions of wind turbines from satellite imagery as accurately as humans can. Thus, to power Satlas, we have developed a high-accuracy deep learning model for each of the geospatial data products above. Every month, Satlas applies these models to the latest Sentinel-2 images to derive an up-to-date global snapshot of each data product.
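Conceptually, each monthly update tiles the latest imagery and runs the relevant model over every tile. The sketch below illustrates this with a toy stand-in for the detector; the architecture, scene size, tile size, and threshold are illustrative, not the actual Satlas pipeline:

```python
# Toy sketch of tiled inference over a Sentinel-2 scene.
import torch
import torch.nn as nn

# Placeholder two-layer "detector"; the real Satlas models are far larger.
detector = nn.Sequential(
    nn.Conv2d(13, 16, 3, padding=1), nn.ReLU(),
    nn.Conv2d(16, 1, 1), nn.Sigmoid(),  # per-pixel wind-turbine probability
)
detector.eval()

# Stand-in for one Sentinel-2 scene (13 bands; real scenes are ~10,980 px square).
scene = torch.rand(13, 2048, 2048)
tile = 512
detections = []
with torch.no_grad():
    for y in range(0, scene.shape[1], tile):
        for x in range(0, scene.shape[2], tile):
            chip = scene[:, y:y + tile, x:x + tile].unsqueeze(0)
            prob = detector(chip)[0, 0]
            ys, xs = torch.nonzero(prob > 0.999, as_tuple=True)
            # A real pipeline would convert pixel offsets back to lat/lon here.
            detections += [(y + int(py), x + int(px)) for py, px in zip(ys, xs)]
```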

We anticipate that the geospatial data products in Satlas will be useful for a wide range of planetary and environmental monitoring applications. Skylight at AI2 is already exploring whether the marine infrastructure data can be used to improve the classification of vessel movement trajectories, and we are actively looking for other use cases.

Training Data

Developing high-accuracy deep learning models depends on a large number of high-quality training examples. We have manually labeled 36K wind turbines, 4K solar farms, 7K offshore platforms, and 3K tree cover canopy percentages in Sentinel-2 imagery. We have openly released these training examples, along with the model weights that were learned from them.
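For a sense of what a training example contains, a single point annotation might be represented as a GeoJSON-style feature like the following. This schema is illustrative only, not necessarily the exact released format:

```python
# Illustrative GeoJSON-style point label for a wind turbine; the actual
# released label schema may differ.
example_label = {
    "type": "Feature",
    "geometry": {"type": "Point", "coordinates": [-122.33, 47.61]},  # lon, lat
    "properties": {"category": "wind_turbine"},
}
```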

Foundation Models for Sentinel-2

To maximize accuracy, we leverage foundation models for Sentinel-2 that we pre-trained on a large-scale remote sensing dataset, SatlasPretrain. SatlasPretrain combines several terabytes of Sentinel-2 images with 302 million labels. The foundation models are trained to simultaneously perform over a hundred tasks, including land cover segmentation, crop type classification, and building detection.
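Multi-task pre-training of this kind is commonly implemented as a shared backbone with one lightweight head per task. Below is a minimal sketch; the layer sizes and task heads are illustrative stand-ins, whereas the actual SatlasPretrain models use much larger backbones (e.g., Swin Transformers) and over a hundred heads:

```python
# Minimal multi-task model sketch: one shared backbone, one head per task.
import torch
import torch.nn as nn

class MultiTaskModel(nn.Module):
    def __init__(self):
        super().__init__()
        self.backbone = nn.Sequential(  # shared feature extractor
            nn.Conv2d(13, 32, 3, padding=1), nn.ReLU(),
            nn.Conv2d(32, 64, 3, padding=1), nn.ReLU(),
        )
        self.heads = nn.ModuleDict({
            "land_cover": nn.Conv2d(64, 11, 1),  # per-pixel segmentation
            "tree_cover": nn.Conv2d(64, 1, 1),   # per-pixel regression
            "buildings": nn.Conv2d(64, 1, 1),    # per-pixel detection score
        })

    def forward(self, x, task):
        return self.heads[task](self.backbone(x))

model = MultiTaskModel()
out = model(torch.rand(1, 13, 128, 128), task="land_cover")
print(out.shape)  # torch.Size([1, 11, 128, 128])
```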

The diversity of the images and tasks in SatlasPretrain enables these foundation models to learn descriptive representations of Sentinel-2 satellite images that are robust over different seasons and geographies. This means that, when we take a foundation model and train it to perform one particular task very well (like detecting solar farms), it offers better and more consistent performance than another model that was trained from scratch.
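Continuing the sketch above, fine-tuning for a single task amounts to initializing the backbone from the pre-trained weights, attaching a fresh head, and training both on the task's labels. The checkpoint path and training batch here are placeholders:

```python
# Fine-tuning sketch, reusing MultiTaskModel from the previous snippet.
model = MultiTaskModel()
model.backbone.load_state_dict(torch.load("satlas_backbone.pt"))  # placeholder path

solar_head = nn.Conv2d(64, 1, 1)  # fresh head: solar-farm segmentation
optimizer = torch.optim.Adam(
    list(model.backbone.parameters()) + list(solar_head.parameters()), lr=1e-4
)
loss_fn = nn.BCEWithLogitsLoss()

# Stand-in batch; in practice this comes from the labeled Sentinel-2 chips.
train_loader = [(torch.rand(2, 13, 128, 128), torch.rand(2, 1, 128, 128))]
for images, masks in train_loader:
    logits = solar_head(model.backbone(images))
    loss = loss_fn(logits, masks)
    optimizer.zero_grad()
    loss.backward()
    optimizer.step()
```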

Examples of labels in SatlasPretrain: points and polygons like storage tanks and buildings, polylines like roads and rivers, object properties like whether a power plant uses gas or coal, segmentation labels like land cover, regression labels like tree cover, and classification labels like snow presence.

We have released our Sentinel-2 foundation models along with the SatlasPretrain dataset on which they were trained. SatlasPretrain will appear at the International Conference on Computer Vision in October 2023.

Super-Resolution

We have also begun exploring how to enhance the detail of low-resolution but frequently captured Sentinel-2 imagery, a task known as super-resolution. In multi-frame super-resolution, we use deep learning models to generate a high-resolution image from many low-resolution images of the same location captured at different times. The model tries to combine information across the low-resolution images to predict sub-pixel details.
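One common formulation stacks the low-resolution frames along the channel axis and lets a convolutional network predict the upscaled image via sub-pixel (PixelShuffle) upsampling. The sketch below is illustrative, not the actual Satlas super-resolution model:

```python
# Minimal multi-frame super-resolution sketch: T low-res RGB frames of the
# same location are stacked along channels, and a PixelShuffle layer
# produces a 4x-upscaled output.
import torch
import torch.nn as nn

T, scale = 8, 4  # number of input frames, upscaling factor

model = nn.Sequential(
    nn.Conv2d(3 * T, 64, 3, padding=1), nn.ReLU(),
    nn.Conv2d(64, 3 * scale * scale, 3, padding=1),
    nn.PixelShuffle(scale),  # rearranges channels into an (H*4, W*4) image
)

frames = torch.rand(1, T, 3, 128, 128)  # T low-res captures, different dates
stacked = frames.flatten(1, 2)          # (1, T*3, 128, 128)
sr = model(stacked)
print(sr.shape)                         # torch.Size([1, 3, 512, 512])
```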

One Sentinel-2 input image (left) and the super-resolution output (right); the super-resolved image recovers noticeably more detail.

We have computed output images from our current super-resolution model globally, and these can be viewed on the Satlas website. We plan to continue exploring methods for improving and quantifying accuracy.

What’s in store for the future?

Our primary goal in the short term is to add more geospatial data products to Satlas. We are currently exploring models for mapping urban land use, crop types, and land cover, and hope to incorporate a subset of these into Satlas by the end of 2023. We’re also continuing to work on improving the accuracy of the existing data.

In the long term, we plan to release tools that make it easier for other teams to build similar geospatial data products, including annotating examples, training models, and deploying them.

Check out our current openings, follow @allen_ai on Twitter/X, and subscribe to the AI2 Newsletter to stay current on news and research coming out of AI2.
