Facebook today announced a partnership with Carnegie Mellon University on a research project, the Open Catalyst Project, that will leverage AI to accelerate the search for electrocatalysts, or catalysts that participate in electrochemical reactions. The goal is to enable scalable renewable energy storage by speeding up quantum mechanical simulations by as much as 1,000 times.
Renewable energy sources like wind and solar generate power intermittently and require storage to transfer power from times of peak generation to times of peak demand. Without technological advances, some researchers estimate that the overall penetration of solar power, for example, will be capped at 30%, with costs starting to rise substantially after 20% penetration.
Historically, batteries have been too expensive to scale; an alternative is using chemical reactions to convert energy into fuels like hydrogen and ethanol, enabling power to be efficiently stored for days, weeks, or months. But this process needs catalysts to drive the chemical reactions, the discovery of which can involve complex, time-consuming quantum simulations.
That’s where Open Catalyst 2020 (OC2020) comes in. OC2020 is the result of a year-long collaboration between Facebook and the research group of Professor Zachary Ulissi at Carnegie Mellon’s Department of Chemical Engineering. Focusing on molecules that are important in renewable energy applications, it’s a dataset compiled using Facebook’s datacenter-optimizing Optimus tool on spare compute cycles over the course of four months. OC2020 comprises over 1.3 million relaxations of molecular adsorptions onto surfaces, making it the largest collection of electrocatalyst structures to date.
Modern catalyst design taps simulation to determine if a material is suitable for further exploration. The simulation models the interaction of a molecule with a catalyst’s surface, a physical process called adsorption. The adsorbed molecules — i.e., adsorbates — are typically the key types involved in the reactions of interest, such as OH, O2, or H2O.
Assuming catalysts are created from up to three of the 40 known metals, there are nearly 10,000 combinations of elements. And because each combination must be tested by adjusting the ratios, the possibilities expand into the billions.
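The "nearly 10,000" figure can be checked with a quick count. The sketch below assumes a combination means an unordered set of distinct elements and ignores ratios, crystal structures, and surface orientations, which is where the jump to billions comes from. The variable names are illustrative.

```python
from math import comb

N_METALS = 40  # figure quoted in the article

# Number of distinct element sets of each size (order ignored, no repeats).
by_size = {k: comb(N_METALS, k) for k in (1, 2, 3)}

exactly_three = by_size[3]          # 9880
up_to_three = sum(by_size.values()) # 40 + 780 + 9880 = 10700

print(f"sets of exactly three metals: {exactly_three}")
print(f"sets of one, two, or three metals: {up_to_three}")
```

Under these assumptions, three-element sets alone account for 9,880 candidates, and allowing one- and two-element catalysts brings the total to 10,700.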
Current workflows allow scientists to try three or four possible catalyst combinations per year. Quantum mechanical simulation tools like those developed by Ulissi’s team can provide insight into roadblocks and focus efforts on the most promising catalyst candidates, but even modern computational laboratories struggle to exceed 40,000 simulations per year.
That’s because tools such as Density Functional Theory (DFT) use a process colloquially called “relaxation,” where they combine the locations of the atomic nuclei with quantum mechanics to predict the energy of the system and the forces acting on each atom. The locations of the nuclei are updated to minimize the energy, consequently changing the electronic distributions and energies. This iterative process continues until the energy of the system reaches a local minimum; by examining the energy of this relaxed, lowest-energy state, researchers can get a sense of how much energy is needed to drive the reaction.
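The loop itself is simple to illustrate, even though each step is expensive in practice. The sketch below is a toy, not DFT: it assumes two atoms interacting through a classical Lennard-Jones potential in reduced units and relaxes their separation by steepest descent. In a real DFT relaxation, the energy and forces at every step come from solving the electronic structure, which is what makes each step costly. All function names and parameters here are illustrative.

```python
def lj_energy(r, eps=1.0, sigma=1.0):
    """Lennard-Jones pair energy at separation r (toy stand-in for a DFT energy)."""
    sr6 = (sigma / r) ** 6
    return 4.0 * eps * (sr6 * sr6 - sr6)


def lj_force(r, eps=1.0, sigma=1.0):
    """Force along the separation coordinate, F = -dE/dr."""
    sr6 = (sigma / r) ** 6
    return 24.0 * eps * (2.0 * sr6 * sr6 - sr6) / r


def relax(r, step=0.01, force_tol=1e-6, max_steps=10_000):
    """Move along the force until it (nearly) vanishes, i.e. a local energy minimum."""
    for n in range(max_steps):
        f = lj_force(r)
        if abs(f) < force_tol:
            return r, lj_energy(r), n
        r += step * f  # steepest descent: step in the direction of the force
    raise RuntimeError("relaxation did not converge")


r_relaxed, e_relaxed, n_steps = relax(1.5)
print(f"relaxed separation {r_relaxed:.6f}, energy {e_relaxed:.6f}, steps {n_steps}")
```

For this potential the minimum is known analytically (separation 2^(1/6), energy -1 in reduced units), so the toy can be checked against the exact answer.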
The process of relaxation is computationally complex and intensive, taking hours or even days per relaxation on high-end servers. It also scales poorly when the number of atoms is increased, with both longer computation times and an increased failure rate.
Facebook and Carnegie Mellon believe machine learning techniques might be the answer because of their ability to make quick, good approximations. By taking as input the state of a system including the atom positions, element types, bond information, and more, AI algorithms can predict system properties such as energy. Efficient DFT approximations, then, could make it possible to compute all potential catalyst surfaces and binding sites through brute force before they’re verified with traditional methods.
There’s reason to believe in the efficacy of AI approaches to catalyst discovery. Already, researchers have applied models to discover intermetallic surfaces and catalysts that transform waste carbon into commercially valuable products. Using AI to search for clean energy materials was explored at a 2017 workshop organized in collaboration with the Canadian Institute for Advanced Research. And Ulissi was one of several researchers to receive a $1.2 million grant from the U.S. Department of Energy in 2019 to use machine learning and data science to design more effective catalysts for chemical processing.
Within the purview of the Open Catalyst Project, Facebook and Carnegie Mellon say they’ve begun to experiment with using a small number of DFT calculations to train more efficient AI models on the physics governing quantum mechanics. Effectively, they’ve been teaching the models to approximate the energy and forces of molecules based on past data.
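The idea of learning energies and forces from a handful of reference calculations can be sketched in a few lines. This is a deliberately simplified illustration, not the project's models: it assumes a Lennard-Jones function stands in for the expensive reference calculation and fits a two-parameter linear model by least squares, with a basis chosen so the model can represent the reference exactly. Forces then come from differentiating the fitted energy. Every name below is hypothetical.

```python
def reference_energy(r):
    """Stand-in for an expensive reference calculation (Lennard-Jones, reduced units)."""
    return 4.0 * (r ** -12 - r ** -6)


def fit_surrogate(rs, energies):
    """Least-squares fit of E ~ w6 * r^-6 + w12 * r^-12 via the 2x2 normal equations."""
    a11 = a12 = a22 = b1 = b2 = 0.0
    for r, e in zip(rs, energies):
        x1, x2 = r ** -6, r ** -12
        a11 += x1 * x1
        a12 += x1 * x2
        a22 += x2 * x2
        b1 += x1 * e
        b2 += x2 * e
    det = a11 * a22 - a12 * a12
    w6 = (b1 * a22 - a12 * b2) / det
    w12 = (a11 * b2 - a12 * b1) / det
    return w6, w12


def predict_energy(r, w6, w12):
    return w6 * r ** -6 + w12 * r ** -12


def predict_force(r, w6, w12):
    """Force from the surrogate, F = -dE/dr of the fitted energy."""
    return 6.0 * w6 * r ** -7 + 12.0 * w12 * r ** -13


# "Past data": eight reference calculations at separations 1.0 .. 1.7.
train_r = [1.0 + 0.1 * i for i in range(8)]
train_e = [reference_energy(r) for r in train_r]
w6, w12 = fit_surrogate(train_r, train_e)

r_new = 1.25  # a separation not in the training set
error = abs(predict_energy(r_new, w6, w12) - reference_energy(r_new))
print(f"fitted weights: w6={w6:.6f}, w12={w12:.6f}; held-out energy error: {error:.2e}")
```

Real catalyst systems have no such convenient basis, which is why the project relies on far more expressive models and a far larger dataset.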
This research direction motivated the creation of OC2020, which Facebook and Carnegie Mellon describe as much larger and better suited to the purpose than many existing datasets. Baseline models trained on OC2020 take between 12 and 72 hours to execute a relaxation with Optimus and Facebook’s servers, with each relaxation consisting of hundreds of smaller time steps. The goal is to eventually compute relaxations in seconds.
Speed isn’t the only factor. At each relaxation step, the forces at play on each atom in the system must be accurately predicted. Failure to do so means compounding errors until eventually the simulation bears little to no resemblance to reality. “A mistake on the scale of hundredths of an angstrom, a fraction of the size of an atom, might result in pursuing catalysts that are less efficient than we expected from our model — or worse, result in us overlooking a crucial breakthrough in electrocatalysis,” Larry Zitnick, a Facebook research scientist on the Open Catalyst Project, explains in a blog post. “Approximating DFT calculations poses an exceedingly difficult AI problem.”
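A toy calculation shows how a small force error becomes a structural error of the size Zitnick describes. The sketch below assumes the simplest possible case: one coordinate in a harmonic well and a constant bias in the predicted force. The stiffness, equilibrium position, and bias are made-up values chosen for illustration, not figures from the project.

```python
K = 5.0      # hypothetical stiffness of the well, eV/Å^2
X_EQ = 2.0   # hypothetical true equilibrium position, Å
BIAS = 0.05  # hypothetical constant error in the predicted force, eV/Å


def true_force(x):
    return -K * (x - X_EQ)


def model_force(x):
    """The same force with a small systematic prediction error added."""
    return true_force(x) + BIAS


def relax(force, x, step=0.05, force_tol=1e-8, max_steps=10_000):
    """Steepest-descent relaxation using whichever force function is supplied."""
    for _ in range(max_steps):
        f = force(x)
        if abs(f) < force_tol:
            return x
        x += step * f
    raise RuntimeError("relaxation did not converge")


x_true = relax(true_force, 2.5)
x_model = relax(model_force, 2.5)
shift = x_model - x_true
print(f"relaxed position shifts by about {shift:.4f} Å (BIAS / K = {BIAS / K:.4f} Å)")
```

In this idealized case the relaxed position moves by the bias divided by the stiffness, so the assumed numbers give a shift of roughly a hundredth of an angstrom. Real systems have many coupled atoms and errors that vary from step to step, so the effect is harder to predict.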
Facebook and Carnegie Mellon say they hope that the Open Catalyst Project and the release of the dataset and models will inspire researchers in the broader community and jump-start efforts hindered by a lack of compute. Moreover, Zitnick postulates that the techniques used to model quantum interactions might carry over to challenges in water quality remediation, medical treatment discovery, advanced manufacturing, and geochemistry.
“We are determined to enable the community to build on our work and developments in an effort to advance the state of the art as quickly as possible,” Zitnick continued. “The Open Catalyst Project is committed to sharing our future AI models, baselines, and evaluation metrics, as well as any future datasets we create … If successful, this research has the potential to significantly accelerate the global shift towards renewable energy, removing the high costs associated with current electrocatalysts, providing a scalable alternative to expensive storage technologies like batteries, and supplying clean and sustainable power the world over.”