Software

Generalized Adsorption Simulator for Python: GASpy

I was the lead developer of GASpy, a tool designed to create, manage, and execute high-throughput Density Functional Theory (DFT) calculations across several supercomputers. I also built an active learning framework on top of GASpy that automatically selected and performed the DFT calculations that were more likely to lead to the discovery of high-performing catalysts. GASpy is built on top of Luigi, FireWorks, MongoDB, pymatgen, and TPOT.
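To make the active learning idea concrete, here is a minimal sketch of such a loop. It is not GASpy's actual code or API: `fake_dft`, `predict`, and `active_learning` are hypothetical names, the "DFT calculation" is a toy function, and the surrogate model is a simple nearest-neighbor lookup standing in for a real regressor. The acquisition rule is purely greedy, assuming we want candidates whose predicted energy is closest to a target value.

```python
import random


def fake_dft(x):
    """Hypothetical stand-in for an expensive DFT calculation (toy energy)."""
    return (x - 0.3) ** 2


def predict(x, labeled):
    """Toy surrogate: reuse the energy of the nearest already-calculated point."""
    nearest = min(labeled, key=lambda known: abs(known - x))
    return labeled[nearest]


def active_learning(candidates, target, n_rounds, batch_size, seed=0):
    """Greedy active learning loop over a pool of candidate descriptors."""
    rng = random.Random(seed)
    pool = list(candidates)
    rng.shuffle(pool)

    # Seed the surrogate with a small random batch of calculations.
    labeled = {x: fake_dft(x) for x in pool[:batch_size]}
    pool = pool[batch_size:]

    for _ in range(n_rounds):
        if not pool:
            break
        # Rank the remaining candidates by how close the predicted energy
        # is to the target, then run "DFT" on the most promising ones.
        pool.sort(key=lambda x: abs(predict(x, labeled) - target))
        batch, pool = pool[:batch_size], pool[batch_size:]
        for x in batch:
            labeled[x] = fake_dft(x)

    return labeled


# Usage: 21 toy candidates, three rounds of two calculations each.
candidates = [i / 20 for i in range(21)]
results = active_learning(candidates, target=0.0, n_rounds=3, batch_size=2)
best = min(results, key=lambda x: abs(results[x] - 0.0))
print(best, results[best])
```

A real framework would swap the toy surrogate for a trained regression model and would typically balance exploitation against exploration instead of ranking greedily.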

Codebase: https://github.com/ulissigroup/GASpy
References: Original paper and perspective

Open Catalyst Project

Scientists at Facebook AI Research were looking for ways to use machine learning (ML) to save the world, and so we teamed up to form the Open Catalyst Project. Our goal was to “discover new catalysts for use in renewable energy storage to help in addressing climate change.” We worked towards this goal by creating one of the largest DFT datasets in the world for catalysts. We then trained a suite of state-of-the-art ML models from the literature on this dataset. This initial suite of models was used to kickstart an open challenge, giving researchers a baseline to improve upon.

In addition to being part of the original team that scoped and guided the project, I was a developer for the codebase used to create the dataset.

Website: https://opencatalystproject.org
Data-generating codebase: https://github.com/Open-Catalyst-Project/Open-Catalyst-Dataset
Machine learning codebase: https://github.com/Open-Catalyst-Project/ocp
Dataset: https://github.com/Open-Catalyst-Project/ocp/blob/master/DATASET.md