HDF5 read failure in hdf_archive::read /format (QMCPack 3.17.1, cray-hdf) #4744

Open
svandenhaute opened this issue Sep 26, 2023 · 8 comments

Comments

@svandenhaute

Describe the bug

I ran a simple PBE calculation in ORCA (cc-pVDZ basis), converted the .gbw file to molden using orca_2mkl, and then used molden2qmc to convert it into an .h5. I then generated a simple QMCPack input which is supposed to optimize the Jastrows.
The script crashes when it's reading the .h5 file -- any idea why?

HDF5 read failure in hdf_archive::read /format
Fatal Error. Aborting at Unhandled Exception
MPICH ERROR [Rank 0] [job id 4614716.0] [Tue Sep 26 17:33:00 2023] [nid005005] - Abort(1) (rank 0 in comm 0): application called MPI_Abort(MPI_COMM_WORLD, 1) - process 0

I've attached the molden file, job script, input .xml, and converted .h5. The same happens when I use a different basis, or use the .gbw from a CCSD(T) calculation ... I'm expecting it to be something very stupid (like that the .h5 is empty or has the completely wrong format). I've appended the directory structure of the .h5 as well for convenience.
h5_structure.txt

To Reproduce
Steps to reproduce the behavior:

  1. execute the attached .xml

System:

  • LUMI; the build passed all deterministic tests.
  • modules:
module load LUMI/22.12
module load partition/G
module load buildtools
module load cray-fftw
module load cray-hdf5-parallel
module load libxml2/2.9.14-cpeAMD-22.12
module load Boost/1.81.0-cpeAMD-22.12

opt.zip

@svandenhaute
Author

svandenhaute commented Sep 26, 2023

... though the h5 file structure looks entirely similar to e.g. this example in the QMCPack repository

EDIT: Molden can visualize the orbitals just fine as well. So it's either the actual numeric values in the H5 or really something internal in QMCPack...

@markdewing
Contributor

It's failing to find "/format" here

H5File.read(format, "/format");
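
A quick way to check this from the user side is to list the entries at the root of the .h5 file and see whether "format" is among them. The following is only a minimal sketch: the function names are hypothetical, and the idea that an LCAO file can be recognized by a top-level "basisset" group is an assumption about converter-generated files, not something stated in this thread.

```python
def guess_orbital_format(top_level_names):
    """Guess the orbital file flavor from the names at the HDF5 root."""
    names = set(top_level_names)
    if "format" in names:
        # The spline reader asks for "/format" first (see the read call above).
        return "eshdf"
    if "basisset" in names:
        # Assumption: converter-generated LCAO files carry a "basisset" group.
        return "lcao"
    return "unknown"


def root_names(path):
    """Return the names at the root of an HDF5 file (requires h5py)."""
    import h5py  # imported lazily so the classifier above works without it

    with h5py.File(path, "r") as f:
        return list(f.keys())
```

With h5py installed, `guess_orbital_format(root_names("orbitals.h5"))` would report which reader the file is likely meant for.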

@ye-luo
Contributor

ye-luo commented Sep 26, 2023

You requested the spline orbital type, but your h5 file contains LCAO orbitals.
From your input file:

 <sposet_builder type="bspline"
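
To see which builder type an input file requests without reading it by hand, the XML can be scanned with the Python standard library. This is a rough sketch: the function name is hypothetical, and the embedded XML is a heavily trimmed stand-in for a real QMCPACK input, which has many more elements.

```python
import xml.etree.ElementTree as ET


def sposet_builder_types(xml_text):
    """Return the 'type' attribute of every <sposet_builder> element."""
    root = ET.fromstring(xml_text)
    return [el.get("type") for el in root.iter("sposet_builder")]


# Hypothetical, trimmed-down input used only for illustration.
example = """
<simulation>
  <qmcsystem>
    <wavefunction name="psi0" target="e">
      <sposet_builder type="bspline" href="orbitals.h5"/>
    </wavefunction>
  </qmcsystem>
</simulation>
"""

print(sposet_builder_types(example))
```

For the trimmed example this prints `['bspline']`, which is the type that makes QMCPACK look for a spline file rather than an LCAO one.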

@markdewing
Contributor

(For developers)
PR #4268 introduced some changes that altered the behavior in this case.
If the read of "/format" and "/version" is changed to readEntry (which skips the error check for that entry), the error message becomes

  Reading 8 orbitals from HDF5 file.
  HDF5 orbital file version 0.0.0
ERROR EinsplineSetBuilder::ReadOrbitalInfo too old h5 file which is not in ESHDF format! Regenerate the h5 fileEinsplineSetBuilder::set_metadata Error reading orbital info from HDF5 file.
Fatal Error. Aborting at Unhandled Exception

This seems like it might make the issue a little easier to diagnose?

@svandenhaute
Author

svandenhaute commented Sep 26, 2023

Ah, got it. Given that this .xml was generated with the generate_qmcpack_input function from Nexus, which arguments should I use to specify the format of the given .h5? I believe the format is either 'molecularorbital' or 'lcao', which may or may not be synonymous.

Currently, I'm using:

from nexus import generate_qmcpack_input, loop, linear

# `dimer` is the physical system object, created earlier in the script (not shown)
opt = generate_qmcpack_input(
    id           = 'qmc',
    #identifier  = 'qmc',
    driver       = 'batched',
    #path        = 'dimer/opt',              # directory for opt run
    system       = dimer,                    # physical system
    # input format selector
    input_type   = 'basic',
    orbitals_h5  = 'orbitals.h5',
    pseudos      = ['ecps/H.ccECP.xml', 'ecps/O.ccECP.xml'],
    # qmcpack input parameters
    corrections  = [],
    jastrows     = [('J1', 'bspline', 8, 6),   # 1 body bspline jastrow
                    ('J2', 'bspline', 8, 8)],  # 2 body bspline jastrow
    calculations = [
        loop(max = 6,                          # No. of loop iterations
             qmc = linear(                     # linearized optimization method
                 gpu                  = 'yes',
                 energy               =  0.0,
                 unreweightedvariance =  1.0,
                 reweightedvariance   =  0.0,
                 timestep             =  0.5,
                 warmupsteps          =  100,
                 samples              = 16000,
                 stepsbetweensamples  =   10,
                 blocks               =   10,
                 minwalkers           =  0.1,
                 bigchange            = 15.0,
                 alloweddifference    = 1e-4,
                 ),
             ),
        ],
    )

@ye-luo
Contributor

ye-luo commented Sep 26, 2023

(For developers) PR #4268 introduced some changes that altered the behavior in this case. If the read of "/format" and "/version" is changed to readEntry (which skips the error check for that entry), the error message becomes

  Reading 8 orbitals from HDF5 file.
  HDF5 orbital file version 0.0.0
ERROR EinsplineSetBuilder::ReadOrbitalInfo too old h5 file which is not in ESHDF format! Regenerate the h5 fileEinsplineSetBuilder::set_metadata Error reading orbital info from HDF5 file.
Fatal Error. Aborting at Unhandled Exception

This seems like it might make the issue a little easier to diagnose?

Simply switching back to read() is still not significantly better. In that area of the code, we need to refine the error message based on the state of the code as of today.
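
As a rough illustration of the difference between a strict read and a lenient one, and of what a more descriptive message could look like, here is a Python analogy. It is not QMCPACK's C++ code: every name below is hypothetical, a plain dict stands in for the HDF5 archive, and the message wording is only a suggestion.

```python
class ReadError(RuntimeError):
    """Raised when a required entry is missing from the archive."""


def read(archive, name):
    """Strict read: raise a generic error if the entry is missing."""
    if name not in archive:
        raise ReadError(f"HDF5 read failure: {name}")
    return archive[name]


def read_entry(archive, name):
    """Lenient read: return (found, value) and let the caller decide."""
    if name in archive:
        return True, archive[name]
    return False, None


def read_orbital_format(archive):
    """Use the lenient read so the error can say what likely went wrong."""
    found, fmt = read_entry(archive, "/format")
    if not found:
        raise ReadError(
            'No "/format" entry found. The file is probably not a spline '
            "orbital file; check that the sposet_builder type in the input "
            "matches the kind of h5 file provided."
        )
    return fmt
```

The design point is that the lenient read moves the decision about the message to the caller, which knows that a missing "/format" most likely means the wrong kind of orbital file was supplied.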

@jtkrogel
Contributor

Nexus typically reads the sposet XML created by a converter (e.g. the GAMESS or PySCF one) and then copies it into the generated input.

The type of input structure needed is similar to this example: https://github.com/QMCPACK/qmcpack/blob/develop/tests/molecules/FeCO6_b3lyp_pyscf/FeCO6.wfnoj.xml
The tricky bit might be getting the size of the (AO?) coefficient list provided there.

For other developers: are the nested contents of really needed in this case? It seems like ground state occupation should be default and the coefficient basis size should be determined by reading the h5 file and not from input.

@svandenhaute
Author

Thanks for the response. I'll just switch to PySCF in the meantime...
