A toolkit for data-driven force field design based on binary-encoded SMARTS
Highlights of this package are:
- Map and perform bitwise operations between two molecular substructures of arbitrary size
- Search/iterate a substructure at the SMARTS-primitive level, using both numerical and analytic approaches
- Cluster molecular data by SMARTS using a SMARTS hierarchy
- Calculate energy and gradients using a classical force field based on (a basic implementation of) the SMIRNOFF format
- Geometry optimization
- Force field parameter optimization (under development)
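As a rough illustration of what bitwise operations on binary-encoded SMARTS primitives can look like, the sketch below packs the allowed atomic numbers of a single atom pattern into a plain Python integer. It is not the besmarts API: the helper names `encode_elements` and `decode_elements` are hypothetical, and only the element primitive is modeled.

```python
# Illustration only: plain Python integers used as bit vectors.
# This is NOT the besmarts API; the function names are hypothetical.
# Bit i set means "atomic number i is allowed" for the element primitive.

def encode_elements(atomic_numbers):
    """Pack a collection of allowed atomic numbers into one integer."""
    bits = 0
    for z in atomic_numbers:
        bits |= 1 << z
    return bits

def decode_elements(bits):
    """Unpack a bit vector into a sorted list of atomic numbers."""
    return [z for z in range(bits.bit_length()) if (bits >> z) & 1]

a = encode_elements([6, 7])  # comparable to the SMARTS atom [#6,#7]
b = encode_elements([7, 8])  # comparable to the SMARTS atom [#7,#8]

intersection = decode_elements(a & b)  # elements both patterns allow
union = decode_elements(a | b)         # elements either pattern allows
difference = decode_elements(a & ~b)   # allowed by a but not by b

print(intersection, union, difference)
```

The same idea extends to other primitives (for example bond order or ring membership) by giving each its own bit field.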
See the ChemRxiv preprint for the theoretical underpinnings on which this package is based.
Currently, the best way to install is to clone and then install with pip.
For environment users (e.g. venv or conda), one should probably create an empty environment first:
conda create -n besmarts python
conda activate besmarts
or
python -m venv besmarts
. besmarts/bin/activate
followed by the actual install:
git clone https://github.com/trevorgokey/besmarts besmarts-git
cd besmarts-git/besmarts-core/python
python -m pip install .
cd ../../besmarts-rdkit/python
python -m pip install .
RDKit is needed to decode SMILES into graphs and offers a faster implementation of SMARTS matching when labeling from a SMARTS hierarchy.
Geometry optimization uses the SciPy minimizer; support for it can be installed using a similar process as above with besmarts-scipy.
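After installing, a quick check that the expected modules are visible in the active environment can save some debugging. The sketch below uses only the standard library; the import name `besmarts` is an assumption based on the package name, while `rdkit` and `scipy` are the usual import names of those libraries. The helper `is_installed` is hypothetical.

```python
import importlib.util

def is_installed(module_name):
    """Return True if a top-level module can be located without importing it."""
    return importlib.util.find_spec(module_name) is not None

# "besmarts" as an import name is an assumption, not confirmed by this README.
for name in ("besmarts", "rdkit", "scipy"):
    status = "found" if is_installed(name) else "MISSING"
    print(f"{name}: {status}")
```

A module reported as missing usually means the corresponding `python -m pip install .` step was run in a different environment.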
Molecular mechanics energy and gradient evaluations are implemented, but require partial charges. By default, besmarts will try to charge molecules with am1bcc using the sqm program from the ambertools suite. Consequently, make sure sqm is in your PATH by installing via conda or by other means.
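One way to confirm that sqm is reachable before running an energy evaluation is to look it up on PATH from Python. This is a minimal sketch using the standard library only; the helper name `find_program` is hypothetical and it does not call into besmarts.

```python
import shutil

def find_program(program="sqm"):
    """Return the full path of an executable on PATH, or None if absent."""
    return shutil.which(program)

path = find_program()
if path is None:
    print("sqm was not found on PATH; am1bcc charging will likely fail")
else:
    print(f"sqm found at {path}")
```

If the lookup fails inside an activated conda environment, the environment probably does not have ambertools installed.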
Documentation for this repository is hosted on Read the Docs (RTD).
Contributions in the form of bug reports, enhancements, and general discussions are very welcome. See CONTRIBUTING.md for more details.