Unify metadata #1274

shivupa · 2023-09-04T18:46:43Z

I'm more than happy to improve the spec and/or the [`avogadro-cclib` plugin](https://github.com/OpenChemistry/avogadro-cclib)

One thing I've noticed is that the metadata is really inconsistent across parsers.

- Gaussian gives the functional, but Orca and GAMESS do not
- Orca gives me a list of all keywords
- NWChem gives me 'functional': 'Becke' for dvb_dispersion_bp86_d3zero
- GAMESS gives me 'methods': ['RHF'] for dvb_dispersion_bp86_d3zero
- etc.

I'd really like to be able to read an output file and then re-use those methods / keywords in the input generators.

Originally posted by @ghutchis in #1269 (comment)


ghutchis · 2023-09-04T23:33:43Z

For example, it might be nice to separate dispersion from the functional:

'functional' : 'bp86',
'dispersion' : 'd3zero',
'methods': 'dft',

I like the idea of a string repeating all the keywords, since that may be useful to other programs / validation, etc.
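A minimal sketch of what that layout could look like, assuming the functional has already been identified some other way. The helper `build_metadata`, the `DISPERSION_TOKENS` set, and the example keyword line are all hypothetical and are not part of cclib or its metadata spec:

```python
# Illustrative set of dispersion tokens; not exhaustive and not taken from cclib.
DISPERSION_TOKENS = {"d2", "d3zero", "d3bj", "d4"}


def build_metadata(keyword_line, functional):
    """Build a metadata dict with dispersion kept apart from the functional.

    `functional` is assumed to have been identified already. The raw keyword
    line is stored untouched so other programs can re-use or validate it.
    """
    tokens = [t.lower() for t in keyword_line.split()]
    # Take the first token that looks like a dispersion correction, if any.
    dispersion = next((t for t in tokens if t in DISPERSION_TOKENS), None)
    return {
        "functional": functional,
        "dispersion": dispersion,
        "methods": "dft",
        "keywords": keyword_line,
    }


# Made-up keyword line, for illustration only.
metadata = build_metadata("BP86 D3ZERO def2-SVP", functional="bp86")
```

Keeping `keywords` as the unmodified string means nothing is lost even when the token matching misses something.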

oliver-s-lee · 2023-09-05T07:23:40Z

+1 for this whole endeavour, I think metadata is long overdue some loving attention :)

I like the idea of the keywords line, particularly because it doesn't require any strenuous parsing on cclib's side. Is there any desire to do anything more with keywords (splitting into separate keywords with their associated options, standardising short and long names for Gaussian etc)?

Dispersion should definitely be separate IMHO. We may also want to look at standardising functional names like we do for symmetry labels, e.g. to deal with PBE0 vs PBE1PBE. Related, and because it came up before: we don't currently parse functionals for ORCA because, weirdly, it doesn't report the functional using its common name anywhere.

This is ORCA's way of reporting PBE0:

------------
SCF SETTINGS
------------
Hamiltonian:
 Density Functional     Method          .... DFT(GTOs)
 Exchange Functional    Exchange        .... PBE
   PBE kappa parameter   XKappa         ....  0.804000
   PBE mue parameter    XMuePBE         ....  0.219520
 Correlation Functional Correlation     .... PBE
   PBE beta parameter  CBetaPBE         ....  0.066725
 LDA part of GGA corr.  LDAOpt          .... PW91-LDA
 Gradients option       PostSCFGGA      .... off
 Hybrid DFT is turned on
   Fraction HF Exchange ScalHFX         ....  0.250000
   Scaling of DF-GGA-X  ScalDFX         ....  0.750000
   Scaling of DF-GGA-C  ScalDFC         ....  1.000000
   Scaling of DF-LDA-C  ScalLDAC        ....  1.000000
   Perturbative correction              ....  0.000000
   Density functional embedding theory  .... OFF
   NL short-range parameter             ....  6.900000

And B3LYP:

------------
SCF SETTINGS
------------
Hamiltonian:
 Density Functional     Method          .... DFT(GTOs)
 Exchange Functional    Exchange        .... B88
   X-Alpha parameter    XAlpha          ....  0.666667
   Becke's b parameter  XBeta           ....  0.004200
 Correlation Functional Correlation     .... LYP
 LDA part of GGA corr.  LDAOpt          .... VWN-5
 Gradients option       PostSCFGGA      .... off
 Hybrid DFT is turned on
   Fraction HF Exchange ScalHFX         ....  0.200000
   Scaling of DF-GGA-X  ScalDFX         ....  0.720000
   Scaling of DF-GGA-C  ScalDFC         ....  0.810000
   Scaling of DF-LDA-C  ScalLDAC        ....  1.000000
   Perturbative correction              ....  0.000000
   Density functional embedding theory  .... OFF
   NL short-range parameter             ....  4.800000
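In principle the common name could be recovered from these blocks by looking at the exchange functional, the correlation functional, and the HF exchange fraction together. A rough sketch under that assumption follows. The lookup table only covers the two examples above, the function names are hypothetical, and this is not how cclib currently does it:

```python
import re

# Hypothetical lookup: (exchange, correlation, HF exchange fraction) -> common name.
# Only the two functionals shown above are covered.
KNOWN_FUNCTIONALS = {
    ("PBE", "PBE", 0.25): "PBE0",
    ("B88", "LYP", 0.20): "B3LYP",
}


def parse_scf_settings(text):
    """Collect 'label .... value' lines, keyed by the last word of the label."""
    fields = {}
    for line in text.splitlines():
        match = re.match(r"\s*(.+?)\s+\.\.\.\.\s+(\S+)\s*$", line)
        if match:
            # e.g. "Exchange Functional    Exchange" -> key "Exchange"
            fields[match.group(1).split()[-1]] = match.group(2)
    return fields


def guess_functional(text):
    """Return a common functional name for an SCF SETTINGS block, or None."""
    fields = parse_scf_settings(text)
    # No ScalHFX line (non-hybrid) is treated as zero HF exchange.
    hf_fraction = round(float(fields.get("ScalHFX", 0.0)), 2)
    key = (fields.get("Exchange"), fields.get("Correlation"), hf_fraction)
    return KNOWN_FUNCTIONALS.get(key)
```

The obvious weakness is that different functionals can share the same three values, so a real table would need more of the parameters printed in the block.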

ghutchis · 2023-09-05T16:28:22Z

As far as parsing functionals, I think it's easiest to do from the keywords line in Orca. I'm doing that now anyway.

I personally wouldn't attempt to standardize keywords. Honestly, a common use-case is "I want to re-run this calculation."

oliver-s-lee · 2023-09-07T11:18:39Z

Yeah I think parsing from the keywords is probably easier, but the downside is there's no way of automatically determining what's a functional name and what's a normal keyword. Do you just compare against a whitelist to extract the functional name?

Restarting calculations is a cool use-case, and yes, in that case there's no need to transform keywords. One thing worth considering when we look to implement this is 'keywords' that appear in weird places. E.g. Gaussian has a few options that have to appear after the geometry section, such as the ModRedun and gen/genECP sections.

ghutchis · 2023-09-07T17:31:55Z

> whitelist to extract the functional name?

For now, yes. I'm open to better suggestions, but IMHO it's possible to cover a large percentage of cases with this, since a few functionals are the most popular.
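Roughly, the whitelist approach amounts to something like the sketch below. The contents of `FUNCTIONAL_WHITELIST` and the function name are illustrative assumptions, not the code actually used in the plugin or in cclib:

```python
# Illustrative whitelist of a few popular functionals; deliberately not exhaustive.
FUNCTIONAL_WHITELIST = {"b3lyp", "pbe0", "pbe", "bp86", "tpss"}


def extract_functional(keyword_line):
    """Return the first whitelisted functional on an ORCA-style '!' line, or None."""
    for token in keyword_line.lstrip("!").split():
        if token.lower() in FUNCTIONAL_WHITELIST:
            return token
    return None


functional = extract_functional("! B3LYP D3BJ def2-SVP")
```

Anything not on the list simply comes back as `None`, so the raw keyword string is still needed as a fallback.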

berquist added this to the v1.8.1 milestone Sep 5, 2023

berquist self-assigned this Sep 5, 2023

berquist modified the milestones: v1.8.1, v2.0 Dec 19, 2023
