Learning atomistic neural network potentials (NNPs) with meta-learning. Final project for Stanford's CS330: Deep Multi-Task and Meta-Learning.
You can set up the environment for this repository with either conda or pip. For conda, run the following commands:
conda env create -f environment.yml
conda activate meta-learn-force-fields
For pip, first create and activate a virtual environment:
python -m venv venv/
source venv/bin/activate
and then install all of the requirements:
pip install -r requirements.txt
This project uses the ANI-1 datasets. You can download the data by running:
python download.py
We provide a script for preprocessing the ANI-1 dataset file into ANI-1x and ANI-1ccx, where each dataset is cleaned of NaN values and extraneous data. To preprocess the data, run:
python preprocess.py
This will generate two HDF5 files, one for ANI-1x and another for ANI-1ccx. Each HDF5 file is structured with molecule names as groups at the top level. Each group contains three datasets: atomic_numbers, coordinates, and energy. You can access the data for a specific molecule in your code as follows:
import h5py
import numpy as np
...
dataset = '1x'
molecule_name = 'C1H1N1'
# Open the preprocessed file read-only and copy the datasets into memory.
with h5py.File(f'data/ani{dataset}.h5', 'r') as f:
    molecule = f[molecule_name]
    atomic_numbers = np.array(molecule['atomic_numbers'])
    coordinates = np.array(molecule['coordinates'])
    energy = np.array(molecule['energy'])
...
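When iterating over the groups in a file, it can be handy to know which elements a molecule contains before loading its arrays. The following is a minimal sketch of a helper that splits a group name such as 'C1H1N1' into per-element atom counts. The function parse_molecule_name is hypothetical and not part of this repository, and it assumes every group name follows the pattern seen in the example above (an element symbol followed by an integer count, repeated).

```python
import re

# Hypothetical helper, not part of this repository. It assumes group names
# look like 'C1H1N1': an element symbol followed by an atom count, repeated.
_TOKEN = re.compile(r"([A-Z][a-z]?)(\d+)")


def parse_molecule_name(name):
    """Return a dict mapping element symbol to atom count."""
    counts = {}
    pos = 0
    for match in _TOKEN.finditer(name):
        # Reject names with characters that do not fit the assumed pattern.
        if match.start() != pos:
            raise ValueError(f"unexpected text in molecule name: {name!r}")
        symbol, number = match.group(1), int(match.group(2))
        counts[symbol] = counts.get(symbol, 0) + number
        pos = match.end()
    # Reject empty names and names with trailing unparsed characters.
    if pos != len(name) or not counts:
        raise ValueError(f"could not parse molecule name: {name!r}")
    return counts


print(parse_molecule_name("C1H1N1"))  # {'C': 1, 'H': 1, 'N': 1}
```

If the names in your preprocessed files follow a different convention, the regular expression would need to be adjusted accordingly.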
You can train the model by running the following:
python train.py
There are several adjustable parameters that you can pass into the command. To learn more about each of these, run python train.py --help.