New data set accelerates search for renewable energy sources
Apr 19, 2022
Transforming renewable resources into other fuels, such as hydrogen, is one scalable solution to the energy challenges posed by climate change. For this approach to be widely adopted, however, low-cost catalysts are needed to drive the necessary chemical reactions at high rates. Unfortunately, finding new catalysts is a highly time- and resource-intensive process. Conventional methods, for example, allow researchers to computationally evaluate tens of thousands of chemical structures per year, yet there are billions of possible combinations of elements to test and consider.
To address this challenge, Meta AI and Carnegie Mellon University’s (CMU) Department of Chemical Engineering have collaborated on the Open Catalyst Project, which aims to build machine learning (ML) models that simulate chemical reactions and accelerate the discovery of low-cost catalysts. Historically, a lack of sufficient training data sets has been a roadblock for researchers developing these ML models. Through this project, Meta AI and CMU are making progress by open-sourcing OC20, the world’s largest training data set of materials for renewable energy storage.
Recently, the collaboration announced an entirely new data set focused on oxide catalysts for the Oxygen Evolution Reaction (OER), a critical chemical reaction used in green hydrogen fuel production via wind and solar energy. The OER data set contains 8 million data points from 40 thousand unique simulations, making it the largest data set for oxide catalysis to date, and it spans a swath of oxide materials across 52 elements. It includes interactions between the surfaces of the oxide materials and five important molecules (O, OH, H2O, OOH, O2) involved in OER, in addition to surface interactions with CO, H, C, and N. It also explores interactions on the surface when crystal defects and multiple molecules are present. In the coming months, the data set and baseline models will be open-sourced to help the global scientific community advance renewable energy technologies.
To identify promising catalysts, research scientists traditionally use quantum mechanical simulation tools like Density Functional Theory (DFT) to predict adsorption energies of small molecules on potential catalysts, a crucial property in determining how effective the catalyst will be. DFT uses quantum mechanics to simulate the movement of atoms in a given scenario, iteratively moving the positions of atoms in the system until they reach their lowest-energy configuration, a process known as a “relaxation.” Each relaxation takes hundreds of hours to complete on a multi-core machine.
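To make the idea of a relaxation concrete, here is a minimal Python sketch, not the project's method. It assumes a toy Lennard-Jones pair potential in place of the quantum mechanical energy that DFT would compute, and the function names, step size, and tolerance are all illustrative choices. The loop repeatedly moves two atoms along the force acting on them until that force is nearly zero, which is the same iterative pattern described above on a vastly simpler system.

```python
# Toy illustration only: a Lennard-Jones pair potential stands in for the
# quantum mechanical energy that a real DFT code would compute.

def lj_energy(r, epsilon=1.0, sigma=1.0):
    """Potential energy of two atoms separated by distance r."""
    sr6 = (sigma / r) ** 6
    return 4.0 * epsilon * (sr6 * sr6 - sr6)


def lj_force(r, epsilon=1.0, sigma=1.0):
    """Force along the bond, -dE/dr (positive values push the atoms apart)."""
    sr6 = (sigma / r) ** 6
    return 24.0 * epsilon * (2.0 * sr6 * sr6 - sr6) / r


def relax(r_start, step=0.01, force_tol=1e-6, max_steps=10000):
    """Move the atoms along the force until the force is nearly zero."""
    r = r_start
    for n in range(max_steps):
        f = lj_force(r)
        if abs(f) < force_tol:
            return r, lj_energy(r), n
        r += step * f  # steepest-descent update on the separation
    raise RuntimeError("relaxation did not converge")


r_min, e_min, n_steps = relax(r_start=1.5)
print(f"relaxed distance {r_min:.4f}, energy {e_min:.4f}, steps {n_steps}")
```

In this toy case each energy and force evaluation is a one-line formula. In a DFT relaxation, every such evaluation is itself an expensive quantum mechanical calculation, which is where the long run times come from.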
ML can significantly accelerate this process, enabling the study’s researchers to replace DFT simulations that currently take hours or days with ML predictions that take a matter of seconds. However, for these ML models to work, they must be trained on a data set of DFT-predicted configurations or energies. To build the new OER data set, Meta AI partnered with Associate Professor Zachary Ulissi and other experts at CMU to determine which materials should be included and to run the DFT calculations used to create baseline models.
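The pattern of training a fast model on expensive reference calculations can be sketched in a few lines of Python. This is only an illustration under loud assumptions: a toy Lennard-Jones energy stands in for DFT, and plain linear interpolation stands in for the ML model, so none of the names below come from the Open Catalyst Project. The point is the workflow: run the reference calculation on a training set once, fit a cheap predictor to it, then query the predictor on inputs it has not seen.

```python
from bisect import bisect_right


def reference_energy(r):
    """Stand-in for an expensive DFT calculation (toy Lennard-Jones energy)."""
    sr6 = (1.0 / r) ** 6
    return 4.0 * (sr6 * sr6 - sr6)


def fit_surrogate(train_r, train_e):
    """Return a fast predictor that interpolates between training points."""
    def predict(r):
        if not train_r[0] <= r <= train_r[-1]:
            raise ValueError("outside the range covered by the training data")
        # Index of the first training point to the right of r, clamped so
        # that the right-hand endpoint still has a valid neighbour.
        i = min(bisect_right(train_r, r), len(train_r) - 1)
        r0, r1 = train_r[i - 1], train_r[i]
        e0, e1 = train_e[i - 1], train_e[i]
        return e0 + (e1 - e0) * (r - r0) / (r1 - r0)
    return predict


# "Training set": 51 reference calculations on a grid of separations.
train_r = [1.0 + 0.02 * k for k in range(51)]
train_e = [reference_energy(r) for r in train_r]
surrogate = fit_surrogate(train_r, train_e)

r_new = 1.23  # a separation that is not in the training set
error = abs(surrogate(r_new) - reference_energy(r_new))
print(f"surrogate error at r={r_new}: {error:.5f}")
```

The sketch also shows why coverage of the training data matters: the predictor refuses to answer outside the range it was fitted on, which is a crude analogue of a model being unreliable on materials unlike anything in its training set.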
OER is a critical electrochemical reaction for hydrogen production and for the intermediate steps involved in that process. Because existing catalysts rely on expensive precious metal oxides of limited availability, such as ruthenium oxide and iridium oxide, the need for efficient, low-cost OER catalysts has become more pressing. The collaboration’s new data set enables researchers to train and build ML models that will quickly identify these low-cost oxide catalysts.
Improved catalysts for OER will advance several renewable energy technologies, such as solar and wind fuel production, as well as rechargeable metal-air batteries, a type of renewable energy storage device used in electric vehicles.
With the upcoming open-source release of this data set, the team's researchers hope to spur scientific progress, helping others overcome the computational limits of previous methods. More broadly, they hope their work will help the computational chemistry community discover promising new materials at scale.
A version of this article first appeared on Meta AI's website on April 18, 2022.