AI for Earth, 2023 in Review

Patrick Beukema
Published in AI2 Blog
Dec 31, 2023 · 7 min read


It was not an easy year for our planet. It was the warmest year on record (again), with extreme heat hitting North America, Europe, and China. Forty-seven million acres burned in the Canadian wildfires, nearly three times the previous record. There were over two dozen extreme weather and climate-related disasters in the United States alone, resulting in losses totaling more than $60 billion. Ocean temperatures keep rising, faster than expected, threatening marine life and driving losses in biodiversity. Eleven billion snow crabs vanished from the Bering Sea, causing Alaska to cancel its snow crab season for the first time in history. There is no shortage of dire headlines and causes for alarm. But in the last year, we have also witnessed AI-enabled breakthroughs that are changing the landscape of climate change and sustainability research and impact. AI is increasingly seen as a key to solving many of our greatest environmental challenges.

Here are some of the key themes and highlights of the year, which have given me reasons for hope and optimism. I hope they will do the same for you.

A triptych: an ocean, a forest, and a mountain.
Photos by Gatis Marcinkevics, samsommer, and Olena Bohovyk on Unsplash

“We are seeing increasing evidence that artificial intelligence can prove an invaluable instrument in tackling climate change.” — UN Climate Change Executive Secretary Simon Stiell, December 2023

Commitment to transforming research into actionable strategies

At conferences and workshops throughout the year, researchers outlined pragmatic strategies for translating research into action. We must prioritize “use-inspired basic research,” Zaira Razu-Aznar emphasized in her keynote at this year’s NeurIPS climate change meeting. It is not sufficient to build state-of-the-art models: 1) we need to build solutions that directly address specific technical problems, and 2) we need to deploy them. At the inaugural NeurIPS sustainability meeting, David Rolnick distinguished the principles of impact-oriented innovation from the canonical paradigm used in ML research. Impact-oriented innovation uses data from real problems rather than stereotyped datasets, and it measures success by how the model will be used in practice rather than against a leaderboard.

Scientific quadrant with Bohr (basic research), Pasteur (use-inspired basic research), and Edison (applied research).
Emphasis on Pasteur’s quadrant, upper right, from Zaira Razu-Aznar’s keynote

To put the use in “use-inspired”, models must be deployed, and therefore engineering should be understood as a central component of impactful research. It is commonplace to treat research and engineering as independent efforts, but for impact-oriented ML they are inextricably linked and self-reinforcing. The most effective initiatives will consider the entire process of research and deployment holistically when designing solutions. Siloed development and colonial science exacerbate the lack of real-world translation and reinforce inequities. Therefore, climate and sustainability practitioners must be part of the process from the outset.

Artificial intelligence, combined with the massive quantity of publicly available satellite imagery, presents many opportunities for real-world impact. For example, plastic debris in our oceans is a well-known threat to our health and the health of marine ecosystems, and this month a lab from EPFL created a computer vision model that can detect marine plastic at global scale [blog, paper, code]. The same lab is partnering with The Ocean Cleanup (which launched System 03 last summer) to identify the regions containing the highest density of plastics in the Pacific.

System 03, which can clean an area the size of a football field every five seconds (The Ocean Cleanup)

One of the critical factors when designing solutions for real-world applications is the latency of information, which typically receives little attention in research papers. Low latency is especially critical for humanitarian assistance after climate-related disasters. After the Maui wildfires, Microsoft’s AI4Good lab built and released, in less than a day, a computer vision model capable of detecting damaged buildings from satellite imagery; the model was then put to use by the Red Cross [demo].

Foundation models for climate, weather, and geospatial data

2023 was the year of many “firsts” for environmental AI. In January, UCLA and Microsoft teamed up to release ClimaX, the first foundation model for weather and climate [blog, paper, code]. Microsoft’s geospatial team released the first foundation model for Landsat imagery alongside the largest Landsat dataset ever [paper, code]. AI2’s geospatial team released one of the largest and most diverse satellite imagery datasets ever, composed of Sentinel-2 and NAIP images with 302M labels spanning 137 categories and seven label types [paper, demo, code]. This year, we also saw NASA release its first geospatial model as part of a deepening AI collaboration with IBM [paper, code, demo], as well as the first billion-parameter remote sensing foundation model [paper]. Notably, with a few exceptions, transformer-based architectures are now the dominant choice for geospatial modeling (especially ViTs, and most recently Swin-based models).
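
To make that trend concrete, here is a minimal sketch (not taken from any of the papers above) of how a transformer backbone can be repurposed for multispectral satellite imagery using the timm library; the model name, band count, and class count are illustrative assumptions rather than details from the cited releases.

```python
# Minimal sketch: adapting an ImageNet-pretrained ViT to multi-band satellite
# imagery with timm. All specifics below (model, 12 bands, 137 classes) are
# illustrative assumptions, not details from the papers cited above.
import timm
import torch

model = timm.create_model(
    "vit_base_patch16_224",
    pretrained=True,
    in_chans=12,       # assumed number of spectral bands (Sentinel-2-like input)
    num_classes=137,   # assumed number of target categories
)

# timm adapts the patch-embedding weights when in_chans != 3, so the pretrained
# backbone can still be fine-tuned on multispectral inputs.
dummy_batch = torch.randn(4, 12, 224, 224)  # (batch, bands, height, width)
logits = model(dummy_batch)
print(logits.shape)  # torch.Size([4, 137])
```

Swin and other ViT variants are exposed through the same interface by swapping the model name, which is part of why these backbones have spread so quickly across geospatial work.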

Thanks to NASA, NOAA, and the European Space Agency, researchers have access to an unprecedented volume of publicly available satellite imagery and remote sensing data for global-scale modeling and monitoring. Improvements in spatial resolution from sensor upgrades, and in temporal resolution from the growing number of satellites in orbit, have led to exponential growth in the data available for oceanic, atmospheric, and geophysical modeling.

Petabytes of publicly available remote sensing data

Commitment to openness and transparency in Environmental AI

Overwhelmingly, researchers in environmental AI are choosing to open source their models, datasets, and code, including at corporations where the open source debate has been heated. This summer, DeepMind released GraphCast, a state-of-the-art AI-based weather prediction model that makes global forecasts in under a minute, at a fraction of the energy cost of conventional numerical/physics-based models. Weather prediction is big business, and many AI startups are actively competing for VC money, each touting its own state-of-the-art weather prediction model. The code to train and run inference with GraphCast, along with the model weights, was open sourced under a permissive license (Apache 2.0), and the model is already in use by the European Centre for Medium-Range Weather Forecasts [blog, paper, code].

Fast and accurate weather predictions have myriad use cases and are critical for providing advance warning of extreme storms, thereby saving lives. At the bleeding edge of climate science, AI is also being used to predict climate at the scale of decades. For instance, NVIDIA and AI2 teamed up on a climate emulator capable of generating stable predictions out to 100 years [blog, paper, code]. High-performance models require large-scale, high-quality, AI-ready datasets. Earlier this month, Mila released ClimateSet, one of the largest and most comprehensive AI-ready datasets ever assembled for climate and weather modeling [blog, paper, code].

ECMWF

Because geospatial modeling relies heavily on computer vision, research in fundamental computer vision can transfer to geospatial-specific applications. Techniques in super-resolution and semantic segmentation, for example, have advanced both weather modeling and geospatial computer vision. And the speed can be breakneck. In April, Meta released the most advanced object segmentation tool ever, a foundation model called Segment Anything (SAM). Thanks to FAIR (and LeCun and his public advocacy for actually open AI), the model, data, and code were all open-sourced, and a geospatial-specific version suited for satellite imagery was independently created less than one week later.
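
As a rough illustration of why that one-week turnaround was possible, here is a minimal sketch, assuming Meta’s open-sourced segment-anything package and its published ViT-H checkpoint, that runs SAM’s automatic mask generator over an RGB satellite tile; the GeoTIFF path is a placeholder, and the geospatial adaptations mentioned above essentially layer georeferencing and tiling on top of this same core API.

```python
# Minimal sketch: running SAM's automatic mask generator on a satellite image.
# Assumes Meta's `segment-anything` package, the published ViT-H checkpoint,
# and a placeholder 8-bit RGB GeoTIFF.
import numpy as np
import rasterio
from segment_anything import SamAutomaticMaskGenerator, sam_model_registry

# Load the pretrained weights released by FAIR.
sam = sam_model_registry["vit_h"](checkpoint="sam_vit_h_4b8939.pth")
mask_generator = SamAutomaticMaskGenerator(sam)

# SAM expects an HxWx3 uint8 array in RGB order.
with rasterio.open("example_tile.tif") as src:  # placeholder path
    image = np.transpose(src.read([1, 2, 3]), (1, 2, 0)).astype(np.uint8)

masks = mask_generator.generate(image)  # list of dicts: 'segmentation', 'area', 'bbox', ...
print(f"SAM proposed {len(masks)} candidate segments")
```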

While open-sourcing code does not guarantee transparency, and both openness and transparency are more of a spectrum than a binary, it is encouraging that academic and industrial labs alike are open-sourcing commercially relevant models under permissive licenses, especially when you consider how contentious the same question has been for LLMs.

“Climate change happens in public. Risk assessments should, too.” — Orianna Chedwiggen (Carbonplan.org)

The acceleration in climate- and sustainability-related AI and ML should be encouraging, but the picture is not all sunshine and rainbows. VC funding crashed hard in 2022/2023 and climate tech was not immune, a sobering fact considering 2023 was the warmest year on record.

And while artificial intelligence may be necessary to solve many climate and sustainability problems, its use does not guarantee progress. AI could easily exacerbate existing inequities. The use of LLMs is likely to become ubiquitous, including in the service of environmental health. The ML community’s trend toward ever larger models poses serious risks. Energy consumption and computational efficiency should always be on one’s mind when designing AI-based solutions and when leveraging those technologies as a user. You might not personally pay the real cost of an LLM query, thanks to obfuscated investor subsidization, but the buck eventually stops at the planet, and the energy expenditure for both training and inference is extraordinary.

Threats to our planet’s health affect us all. But there are growing numbers of researchers and scientists working together to apply one of the most transformative technologies in history to our greatest planetary challenges.

Special thanks to AI2’s 2023 Environmental AI speakers: David Rolnick (McGill, Mila), Devis Tuia (EPFL), Anthony Ortiz (Microsoft AI4Good), Stephen Mandt (UCI), Sarah Stone (eScience), Mike Gartner (AI2), Favyen Bastani (AI2), and Orianna Chedwiggen (Carbonplan).

Follow @allen_ai and @SkylightMarine on Twitter/X and subscribe to the AI2 Newsletter to stay current on news and research coming out of AI2.

