Thermo Estimation

Where to look for thermodynamic estimation code

Most of the code that actually does the estimation is in rmgpy/data/thermo.py
- liquid-phase: rmgpy/data/solvation.py
- QMTP: rmgpy/qm
As far as I can tell, interaction with parameters in databases mainly happens in rmgpy/data. Maybe the qm module (or some of it) should be moved to data.
The relevant database files are in RMG-database/input/
- thermo/libraries/ : Both gas-phase and liquid-phase libraries (I don't think we have any liquid-phase ones yet)
- thermo/groups/ : Several different group contribution values for gas-phase thermo
- solvation/libraries/ : Contains Abraham LSER values for full solutes and solvents
- solvation/groups/ : Group contribution values to calculate Abraham LSER values

Components

Gas phase thermo libraries

Libraries are in RMG-database/input/thermo/libraries/. They are used preferentially, in the order they are specified in the RMG input file. Thermo is looked up in rmgpy/data/thermo/getThermoDataFromLibraries(). In this function, we differentiate between gas-phase thermo libraries and liquid-phase thermo libraries. More on that below.
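The "first library wins" lookup described above can be sketched roughly as follows. This is only an illustration of the ordering rule, not the actual code in getThermoDataFromLibraries(); the function name, the library names, and the dictionary-based library format are all made up for the example.

```python
def thermo_from_libraries(species_key, libraries):
    """Return (library_name, thermo) from the first library containing the species.

    `libraries` is a list of (name, entries) pairs in the order given in the
    input file; `entries` is a plain dict here (hypothetical format).
    """
    for name, entries in libraries:
        if species_key in entries:
            return name, entries[species_key]
    return None  # nothing found: fall back to another estimation method


# Hypothetical libraries, listed in priority order
libraries = [
    ("library_A", {"CH4": "thermo A"}),
    ("library_B", {"CH4": "thermo B", "C2H6": "thermo C"}),
]

first_hit = thermo_from_libraries("CH4", libraries)   # found in library_A first
later_hit = thermo_from_libraries("C2H6", libraries)  # only in library_B
missing = thermo_from_libraries("C3H8", libraries)    # in neither library
```

Returning None on a miss lets the caller decide what to try next (QMTP or group additivity).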

Group additivity for polycyclics, heteroatoms, etc.

Benson group additivity is used.
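As a reminder of what group additivity means in practice, here is a minimal sketch that sums per-group contributions to get an enthalpy of formation. The group labels follow the usual Benson notation, but the numbers are rounded placeholders chosen for the example and should not be used as real group values; the function name is also made up.

```python
from collections import Counter

# PLACEHOLDER contributions in kcal/mol (not real database values)
GROUP_H298 = {
    "C-(C)(H)3": -10.0,
    "C-(C)2(H)2": -5.0,
}


def estimate_h298(groups):
    """Sum the contribution of every group occurring in the molecule."""
    counts = Counter(groups)
    return sum(GROUP_H298[group] * n for group, n in counts.items())


# Propane: two terminal CH3 groups and one central CH2 group
propane_groups = ["C-(C)(H)3", "C-(C)2(H)2", "C-(C)(H)3"]
h298 = estimate_h298(propane_groups)
```

The real estimator also handles heat capacity and entropy, plus corrections (rings, radicals, symmetry) that a plain sum like this leaves out.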
Polycyclics

Hydrogen bond increment theory - what we do with radicals

See this paper

Symmetry

An external program in external/symmetry/symmetry.

As far as I can tell, we only use Symmetry to determine the point group of molecules. The point group is used to decide whether the molecule is chiral, which in turn affects the entropy. In my experience, Symmetry sometimes gets the point group wrong.
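To make the point group to chirality to entropy chain concrete, here is a small sketch based on textbook relations: a point group is chiral when it contains only proper rotations (C1, Cn, Dn, T, O, I), a chiral species gets an entropy contribution of R ln 2, and the rotational symmetry number sigma contributes -R ln(sigma). The function names are made up, and this is not necessarily how rmgpy implements it.

```python
import math
import re

R = 8.314462618  # gas constant, J/(mol*K)


def is_chiral_point_group(point_group):
    """True for point groups with proper rotations only: C1, Cn, Dn, T, O, I."""
    return re.fullmatch(r"C\d+|D\d+|T|O|I", point_group) is not None


def entropy_correction(point_group, symmetry_number):
    """Entropy correction in J/(mol*K) from symmetry number and chirality."""
    correction = -R * math.log(symmetry_number)
    if is_chiral_point_group(point_group):
        correction += R * math.log(2)  # two optical isomers
    return correction


chiral_c1 = entropy_correction("C1", 1)     # chiral, no rotational symmetry
achiral_c2v = entropy_correction("C2v", 2)  # achiral, symmetry number 2
```

A wrong point group from the external program would propagate directly into the entropy through both terms.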

QMTP

Calculate the thermodynamic parameters using quantum mechanics. Useful when unknown thermo parameters are poorly estimated by group additivity (we often turn it on only for cyclics).

rmgpy/qm/main/getThermoData() creates a QM object and generates thermo data using either MOPAC (PM3, PM6, PM7) or Gaussian (PM3, PM6). Then things get a bit convoluted:
- rmgpy/qm/molecule/generateThermoData(): this generates QMdata, determines the point group, and calculates thermo data.
  1. rmgpy/qm/molecule/generateQMData(): this is not implemented in the generic molecule file; it is implemented in the Mopac and Gaussian modules that inherit from it, because it depends on how each quantum chemistry package is run and how its output is parsed. So this is the method (along with the methods it calls) that needs to be written when a developer wants to add support for a different package like NWChem, QChem, etc.
  2. rmgpy/qm/molecule/determinePointGroup(): calls the Symmetry program. See above. Requires that step 1 was successfully completed.
  3. rmgpy/qm/molecule/calculateThermoData(): requires that steps 1 and 2 were completed, and relies heavily on the statmech module. It also does some things in this module itself, like calculating the chirality correction.
How did we get here? We called it in rmgpy/data/thermo/getThermoData(). Yes, it's extremely confusing that these methods have the same name.
Why is some stuff done directly in rmgpy/qm/molecule.py, others in Symmetry, and yet others in rmgpy/statmech? Having statistical mechanics in a separate statmech file structure makes sense, and symmetry is an outside program which we do not maintain, but the rmgpy/qm modules themselves could use some restructuring.
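The three steps above follow a template-method pattern: the generic class fixes the order of the steps, and each package-specific subclass only supplies the QM step. Here is a rough sketch of that structure. All class and method names are hypothetical stand-ins (not the real rmgpy/qm classes), and the point group and thermo steps are stubbed out.

```python
from abc import ABC, abstractmethod


class QMMoleculeSketch(ABC):
    """Generic driver: fixes the order of the three steps."""

    def generate_thermo_data(self):
        qm_data = self.generate_qm_data()                 # step 1
        point_group = self.determine_point_group(qm_data)  # step 2
        return self.calculate_thermo_data(qm_data, point_group)  # step 3

    @abstractmethod
    def generate_qm_data(self):
        """Run the QM package and parse its output (package specific)."""

    def determine_point_group(self, qm_data):
        # Stub: the real step calls the external Symmetry program.
        return "C1"

    def calculate_thermo_data(self, qm_data, point_group):
        # Stub: the real step relies on statistical mechanics routines.
        return {"source": qm_data["package"], "point_group": point_group}


class NewPackageMolecule(QMMoleculeSketch):
    """Adding a new QM package means writing only this method."""

    def generate_qm_data(self):
        return {"package": "new-package"}


result = NewPackageMolecule().generate_thermo_data()
```

Because the base class is abstract, forgetting to implement the package-specific step fails at instantiation rather than halfway through a job.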

Solvation thermodynamics

Thermodynamics for an individual solvated species usually starts with the gas-phase thermodynamics calculation for the same species (except when we use a liquid thermo library, as described below). The solvation thermo, which is (usually) added to the gas-phase value, is estimated with a linear solvation energy relationship (LSER). Like gas-phase thermo, the solute parameters needed for this relationship can either be found in a library (in RMG-database/input/solvation/libraries/solute.py) or estimated by group additivity. The group values are atom-based (like gas-phase Benson groups) (in RMG-database/input/solvation/groups/abraham.py) or group-based (in RMG-database/input/solvation/groups/nonacentered.py). Solutes with unpaired electrons are treated in the same way as in the gas phase: there is a set of radical corrections (in RMG-database/input/solvation/groups/radical.py). The solvent parameters are only found in a library (in RMG-database/input/solvation/libraries/solvent.py).
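For reference, the commonly used Abraham form of the LSER for gas-to-solvent partitioning is log10 K = c + e*E + s*S + a*A + b*B + l*L, with the solvation free energy following from delG = -RT ln K. The sketch below assumes that form; the class names, function names, and all parameter values are placeholders for illustration, not entries from the database.

```python
import math
from dataclasses import dataclass

R = 8.314462618  # gas constant, J/(mol*K)


@dataclass
class Solute:
    E: float
    S: float
    A: float
    B: float
    L: float


@dataclass
class Solvent:
    c: float
    e: float
    s: float
    a: float
    b: float
    l: float


def log10_partition(solute, solvent):
    """Abraham LSER: log10 of the gas-to-solvent partition coefficient."""
    return (solvent.c + solvent.e * solute.E + solvent.s * solute.S
            + solvent.a * solute.A + solvent.b * solute.B
            + solvent.l * solute.L)


def delta_g_solv(solute, solvent, T=298.0):
    """Gibbs free energy of solvation in J/mol: delG = -RT ln K."""
    return -R * T * math.log(10) * log10_partition(solute, solvent)


# Placeholder parameters, chosen only to exercise the formula
solute = Solute(E=0.5, S=0.5, A=0.0, B=0.5, L=3.0)
solvent = Solvent(c=-1.0, e=0.5, s=2.0, a=3.0, b=4.0, l=0.0)
dG298 = delta_g_solv(solute, solvent)
```

The solute parameters are what the library lookup or group additivity supplies; the solvent coefficients always come from the solvent library.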
All the above is for calculating enthalpy and Gibbs free energy of solvation at 298 K. To get Gibbs free energy of solvation at other temperatures, we assume delH and delS are constant and use linear extrapolation, but this breaks down pretty quickly. So further from 298 K we now use a different relationship and use the CoolProp module to get the necessary parameters.
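The constant delH and delS assumption amounts to the following: get delS from the 298 K values, then evaluate delG(T) = delH - T*delS. A minimal sketch with made-up function and variable names and placeholder inputs:

```python
T_REF = 298.0  # K, temperature at which delG and delH are known


def delta_g_at_temperature(T, dG298, dH298):
    """Linear extrapolation of delG_solv, assuming constant delH and delS.

    All energies in J/mol, T in K.
    """
    dS298 = (dH298 - dG298) / T_REF  # from delG = delH - T*delS at 298 K
    return dH298 - T * dS298


# Placeholder inputs: delG = -20 kJ/mol, delH = -40 kJ/mol at 298 K
g_298 = delta_g_at_temperature(298.0, -20000.0, -40000.0)
g_350 = delta_g_at_temperature(350.0, -20000.0, -40000.0)
```

By construction the function reproduces the input delG at 298 K; the error grows as T moves away from the reference temperature, which is why a different relationship is used further out.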
The exception to the above is when the user specifies a thermo library for a specific liquid. Then it isn't necessary to add the solvation thermo to the gas-phase thermo; the library values are used out of the box. The solvent must be specified at the top of the library in the database. The libraries are sorted into liquid and gas in rmgpy/data/thermo/getThermoDataFromLibraries(). It's done there because it depends on whether this function is being run during thermo generation or during training set processing. If it's run during training set processing, the gas-phase thermo is still used, even if it's a liquid-phase job. That's because the reverse rate of the training reactions should not be calculated using liquid-phase thermo.
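The liquid/gas sorting rule can be sketched like this. The function name and the (name, solvent) tuple format are made up, and the sketch assumes that a liquid library is only used when its solvent matches the solvent of the job; check getThermoDataFromLibraries() for the actual behavior.

```python
def usable_libraries(libraries, job_solvent, training_set):
    """Filter libraries, keeping their priority order.

    `libraries` is a list of (name, solvent) pairs, with solvent None for
    gas-phase libraries (hypothetical format).
    """
    usable = []
    for name, solvent in libraries:
        if solvent is None:
            usable.append((name, "gas"))
        elif not training_set and solvent == job_solvent:
            # ASSUMPTION: the library solvent must match the job solvent
            usable.append((name, "liquid"))
        # liquid libraries are skipped during training set processing
    return usable


libs = [("lib_in_water", "water"), ("lib_gas", None)]
generation = usable_libraries(libs, "water", training_set=False)
training = usable_libraries(libs, "water", training_set=True)
```

For entries tagged "liquid" no solvation correction is added afterwards, while entries tagged "gas" still get it in a liquid-phase job.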
All of this happens in rmgpy/data/solvation.py, with the exception of adding in the solvation correction to the gas-phase thermo, which happens in the Thermo Engine (see Random).

Random

Thermo Engine (rmgpy/thermo/thermoengine.py): distributes your thermo calculations to different processors, depending on the system you're running on. It's also where all the thermo is converted to Wilhoit form and the solvation correction is added, if applicable. Modifications to thermo estimation code elsewhere probably don't affect the thermo engine much.
