
NWChem/input-generator

Description

A script to generate NWChem input files for use in performance and scalability experiments

History

This script was created while I was working on the paper cited below, and was hardened significantly while I was working at Argonne.

J. R. Hammond, N. Govind, K. Kowalski, J. Autschbach and S. S. Xantheas, "Accurate dipole polarizabilities for water clusters N=2-12 at the coupled-cluster level of theory and benchmarking of various density functionals," J. Chem. Phys. 131, 214103 (2009).

Please note that this script is not particularly Pythonic, because the author's approach to Python is pragmatic, which is a polite way of saying "lazy".

Usage

Running the script without arguments prints the help message below. Read the section on tuning for your machine before you start generating input files.

Usage: ./make_nwinput.py <cluster> <method> <basis> <task>

<cluster> can be: w1 w2 w3 w4 w5 w6cage w6book w6prism w6cyclic w7 w8s4 w8d2d w9
                  w10 w11i434 w11i4412 w11i443 w11i515 w11i551 w12 w13 w14 w15 w16
                  w17int w17surf w18 w19 w20dode w20fused w20face w20edge w21
                  rubrene or wN for any N not already listed.

<method> can be pbe0, b3lyp or other supported functional represented by a single string
                d-hfx      - direct SCF using the DFT module
                d-scf      - direct SCF using the SCF module
                sd-scf     - semidirect SCF using the SCF module
                d-mp2      - direct MP2
                sd-mp2     - semidirect MP2
                ri-mp2     - resolution-of-identity MP2 (must use Dunning basis)
                "rccsd(t)" - partial-direct RHF-CCSD(T) (does not use point-group symmetry)
                "ccsd(t)"  - canonical ROHF-CCSD(T) via TCE (exploits D2h and subgroups thereof)
                <r>ccsd-t  - (same as above but avoids parentheses in file names)
                <tce>      - any method with cc, mbpt or cis in the name will be treated as a TCE method

<basis> can be 6-31[1][++]G[**] and [aug-]cc-p[c]v[*]z
               (where s = * and p = + because otherwise input files are difficult to deal with)

               Note: There is no pre-defined RI basis for Pople basis sets,
                     so that will not be configured automatically, whereas
                     it will be for Dunning basis sets.

<task> can be energy, optimize, frequency, etc.
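The method rules in the help text can be summarized in a few lines of Python. This is only a sketch of the classification as the help message describes it: `classify_method` and its return labels are hypothetical and are not taken from the script itself. One detail worth noticing is that the semidirect names must be tested before the generic "cc" rule, because `rccsd(t)` also contains "cc".

```python
def classify_method(method):
    """Map a <method> argument to a coarse module label.

    Hypothetical helper: the rules follow the help text, not the script's code.
    """
    m = method.lower()
    # Semidirect CCSD(T) is checked first because its name also contains "cc".
    if m in ("rccsd(t)", "rccsd-t"):
        return "semidirect-ccsd(t)"
    if m in ("d-scf", "sd-scf"):
        return "scf"
    if m in ("d-mp2", "sd-mp2", "ri-mp2"):
        return "mp2"
    if m == "d-hfx":
        return "dft"  # direct SCF run through the DFT module
    # Any other name containing cc, mbpt or cis is treated as a TCE method.
    if any(key in m for key in ("cc", "mbpt", "cis")):
        return "tce"
    # Remaining single-string names are assumed to be DFT functionals.
    return "dft"


for name in ("b3lyp", "ri-mp2", "rccsd(t)", "ccsd(t)", "ccsd-t", "mbpt2"):
    print(name, "->", classify_method(name))
```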

Tuning for your machine

At the top of the script are some parameters that you may need to change for each machine. Alternatively, you can leave them and edit the generated input files. The latter allows for more fine-tuning, and is essential if you want to run different versions of the same input, because you'll need to use different subdirectories if the prefix is the same.

################################################
#                                              #
# MACHINE-DEPENDENT CONFIGURATION INFORMATION  #
#                                              #
################################################

# this is probably reasonable on a system with 4 GB per MPI process
# (assuming running 1 MPI per core, which is not always optimal)
stack_mem=1500
heap_mem=100
global_mem=1500

# Do not store semidirect CCSD integrals on disk.
# This is appropriate if your CPU is much faster than your filesystem.
nodisk = False

# Use OpenMP support in semidirect CCSD(T).
# You must compile your binary with USE_OPENMP for this to be effective.
openmp = True

# these are the paths where your job will write files
# this is the directory where the RTDB and MOVECS files will be written.
# in many cases, it is reasonable to have this path be in your home directory.
# the filesystem on which this directory is located must be shared (e.g. NFS, GPFS, Lustre)
permanent_dir = '.'
# the scratch disk is treated like local disk.
# on almost all machines, it should be the local scratch disk on the node.
# exceptions to this rule are Blue Gene and Cray systems, which either have
# no local disk or the local disk (on Cray, /tmp) should not be used since
# it (1) is small (2) is slow (3) will kill the node if it fills up.
scratch_dir   = '/tmp'
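To see how these parameters end up in an input file, here is a minimal sketch that renders them into the corresponding NWChem directives (`start`, `memory`, `permanent_dir`, `scratch_dir`). It assumes the memory values are in megabytes, and both `render_header` and the example prefix are hypothetical; the script's actual output may be laid out differently.

```python
def render_header(prefix, stack_mem, heap_mem, global_mem,
                  permanent_dir, scratch_dir):
    """Return the top-of-file directives for an NWChem input (illustrative)."""
    lines = [
        f"start {prefix}",
        # Assumption: the configuration values are given in megabytes.
        f"memory stack {stack_mem} mb heap {heap_mem} mb global {global_mem} mb",
        f"permanent_dir {permanent_dir}",
        f"scratch_dir {scratch_dir}",
    ]
    return "\n".join(lines) + "\n"


def total_memory_mb(stack_mem, heap_mem, global_mem):
    """Memory requested per MPI process, in the same units as the inputs."""
    return stack_mem + heap_mem + global_mem


# Hypothetical prefix; the values are the defaults shown above.
header = render_header("w5_b3lyp_cc-pvtz", 1500, 100, 1500, ".", "/tmp")
print(header)
print("per-process request (MB):", total_memory_mb(1500, 100, 1500))
```

With the default values the request per process is 3100 MB, which leaves some headroom on a system with 4 GB per MPI process.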

Small Examples

These jobs will probably run on a laptop and definitely on a workstation.

DFT

DFT is a relatively inexpensive method.

./make_nwchem_input.py w5 b3lyp cc-pvtz energy

Semidirect CCSD(T)

Semidirect CCSD(T) is limited in functionality but is more efficient than the TCE for molecules without symmetry and uses a lot less memory since the full set of two-electron integrals is not stored.

./make_nwchem_input.py w3 "rccsd(t)" 6-31G energy

TCE CCSD(T)

TCE supports symmetry and a wide range of methods, but since CCSD(T) is usually what is used for benchmarking, that is all our script tries to support.

The TCE input files generated by this script use a pretty good set of options, certainly better than what most users guess. Please read the documentation for more information.

./make_nwchem_input.py w3 "ccsd(t)" 6-31G energy

Bigger Examples

There are two parameters that make jobs do more computation: the molecule considered and the basis set. This script focuses on water clusters, which provide a set of molecules that range from trivial to heroic to study with various methods. For example, a CCSD(T) calculation on 24 water molecules with a triple-zeta basis (modified cc-pVTZ) was a Gordon Bell Prize finalist a few years ago.
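Since cluster size and basis set are the two knobs, a scaling study usually means sweeping over both. The sketch below only builds the command lines for such a sweep and prints them; it does not run anything. The script name is the one used in the examples above and may need adjusting for your copy, and `sweep_commands` is a hypothetical helper.

```python
import itertools
import shlex

# Assumption: script name as used in the examples; adjust to match your copy.
SCRIPT = "./make_nwchem_input.py"


def sweep_commands(clusters, method, basis_sets, task="energy"):
    """Yield one shell-quoted command line per (cluster, basis) pair."""
    for cluster, basis in itertools.product(clusters, basis_sets):
        # shlex.join quotes arguments such as ccsd(t) so the shell accepts them.
        yield shlex.join([SCRIPT, cluster, method, basis, task])


for cmd in sweep_commands(["w3", "w5", "w10"], "ccsd(t)", ["cc-pvdz", "cc-pvtz"]):
    print(cmd)
```

Printing the commands first makes it easy to review the sweep, or to paste it into a job script, before any input files are generated.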

You need to have some experience with quantum chemistry methods to understand the precise scaling, but here is a rough guide:

  • One water molecule has 8 valence electrons and 10 total. For DFT and SCF methods, the latter matters. For MP2 and CCSD(T) methods, the former matters.
  • You can determine how many basis functions per water molecule for each basis by running SCF with one water.
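The two bullets above translate directly into the N and R used in the scaling estimates. A minimal sketch, with hypothetical helper names: the number of basis functions per water is passed in as a parameter, since it should come from your own one-water SCF run. The value 24 in the example corresponds to cc-pVDZ with spherical functions.

```python
VALENCE_ELECTRONS_PER_WATER = 8
TOTAL_ELECTRONS_PER_WATER = 10


def electron_count(n_waters, correlated):
    """N for a water cluster.

    correlated=True  -> valence electrons (relevant for MP2 and CCSD(T))
    correlated=False -> all electrons (relevant for SCF and DFT)
    """
    if correlated:
        return n_waters * VALENCE_ELECTRONS_PER_WATER
    return n_waters * TOTAL_ELECTRONS_PER_WATER


def basis_function_count(n_waters, functions_per_water):
    """R for a water cluster, given the count measured for one water."""
    return n_waters * functions_per_water


n_waters = 5
print("N (SCF/DFT):    ", electron_count(n_waters, correlated=False))
print("N (MP2/CCSD(T)):", electron_count(n_waters, correlated=True))
print("R (cc-pVDZ):    ", basis_function_count(n_waters, 24))
```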

SCF and DFT scale as O(R^4) for small systems and O(R^2) to O(N^3) for large systems, where N is the total number of electrons and R is the total number of basis functions.

MP2 scales as O(NR^4). RI-MP2 has a smaller prefactor than semidirect MP2 because the bottleneck kernel is DGEMM rather than atomic integral evaluation.

CCSD(T) scales as O(N^3 R^4). Semidirect CCSD(T) requires O(N^2 R^2) global memory, whereas TCE requires O(R^4). The triples evaluation uses O(T^6) local memory, where T is the tile size, a user-controllable parameter. A tile size of more than 24 will cause CCSD(T) to segfault in most cases.
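The scaling laws above are most useful for estimating how much more expensive one job is than another. The sketch below encodes the small-system exponents quoted in the text and computes cost ratios; it ignores prefactors entirely, so treat the numbers as rough guidance. The tile-memory function assumes a single buffer of T^6 double-precision words, which is an assumption about the constant factor and not a statement about the TCE implementation. All names are hypothetical.

```python
# (power of N, power of R) for each method, as quoted in the text above.
SCALING = {
    "scf-small": (0, 4),  # SCF and DFT for small systems
    "mp2": (1, 4),
    "ccsd(t)": (3, 4),
}


def cost_ratio(method, n_ratio, r_ratio):
    """Relative cost when N grows by n_ratio and R grows by r_ratio."""
    n_pow, r_pow = SCALING[method]
    return n_ratio ** n_pow * r_ratio ** r_pow


def triples_tile_bytes(tilesize, bytes_per_word=8):
    """Local memory for one T^6 buffer (assumes a single buffer of doubles)."""
    return tilesize ** 6 * bytes_per_word


# Doubling the number of waters at fixed basis doubles both N and R.
for method in SCALING:
    print(method, "cost grows by a factor of", cost_ratio(method, 2, 2))

for tile in (16, 20, 24):
    print("tile size", tile, "->", triples_tile_bytes(tile) / 1e6, "MB")
```

Doubling the cluster at fixed basis gives factors of 16, 32 and 128 for SCF, MP2 and CCSD(T) respectively. Under the single-buffer assumption, a tile size of 24 already needs about 1.5 GB of local memory per process.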
