User friendly support and logic for additional Fleur inputfiles in Fleurinpdata #89

Open
broeder-j opened this issue Sep 17, 2020 · 15 comments

Comments

@broeder-j
Member

Some files should become part of FleurinpData and some should not, depending on the use case.
If files have to be modified, have to be consistent with inp.xml, and are sometimes not copied, it may make sense to add them and provide further user-friendly methods for validation and modification.

Here we collect:

  • nmat file: for LDA+U. It is not created by inpgen, only by fleur, and is reused in further runs. The user sometimes provides a first guess (we should provide some initialize function). Parsing is not necessary.

  • sym.out file: external symmetries, important for GW (output only by inpgen). May need to be parsed.

  • relax.xml: file for relaxation with forces and new positions. It is written by fleur and reused by fleur. May need to be parsed.

  • enpara file?: for certain modes one can provide this, but as far as I know, one does not change it.

@anoopkcn, @Tseplyaev something to add here?

@Tseplyaev
Collaborator

Tseplyaev commented Sep 17, 2020

I have just fixed a bug that prevented validation of *.xml files included in an inp.xml (commits db33541 and b0c8622).
Now, if one adds additional *.xml files to a FleurinpData instance that are included in the inp.xml via XInclude,
the XInclude is performed and the resulting inp.xml is validated against the schema.
In contrast, if you just add an *.xml file alongside that is not included in the inp.xml via XInclude, that file will not be validated.

For example, if one initialises FleurinpData:

from aiida_fleur.data.fleurinp import FleurinpData
FleurinpData(files=['inp.xml', 'relax.xml'], node=folder_data_node)

then the relax.xml is inserted into a copy of inp.xml and the copy is validated. Note: neither file is changed in the FleurinpData instance during the validation.
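The "expand XInclude on a copy, leave the original untouched" step described above can be sketched with the Python standard library alone. This is only an illustration, not the aiida-fleur implementation: the file contents are hypothetical and heavily shortened, `load_from_memory` and `expanded_copy` are made-up names, and schema validation is left out because the standard library does not provide it.

```python
import copy
import xml.etree.ElementTree as ET
from xml.etree import ElementInclude

# Hypothetical, heavily shortened file contents; real Fleur files are larger.
FILES = {
    "inp.xml": (
        '<fleurInput xmlns:xi="http://www.w3.org/2001/XInclude">'
        '<xi:include href="relax.xml"/>'
        "</fleurInput>"
    ),
    "relax.xml": "<relaxation><displacements/></relaxation>",
}


def load_from_memory(href, parse, encoding=None):
    """Loader for ElementInclude that reads from FILES instead of the disk."""
    text = FILES[href]
    if parse == "xml":
        return ET.fromstring(text)
    return text


def expanded_copy(name):
    """Return (original tree, copy of the tree with XInclude resolved)."""
    original = ET.fromstring(FILES[name])
    work = copy.deepcopy(original)
    # Only the working copy is modified; this copy is what would be validated.
    ElementInclude.include(work, loader=load_from_memory)
    return original, work


original, work = expanded_copy("inp.xml")
print(ET.tostring(work, encoding="unicode"))
```

The working copy now contains the `relaxation` element in place of the include directive, while `original` still holds the unexpanded `xi:include` element.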

Should it work for sym.out as well, since it can be included in inp.xml via XInclude?

Later addition: I have realised that it currently works for relax.xml only. I opened issue #91 for this.

@anoopkcn
Contributor

sym.out is not created with the -explicit option in the 'old' inpgen. The new inpgen will create this file with a command-line option even with -explicit (I have to check the details). If it does, then it is just a matter of adding this file to the retrieve list and the remote-copy list. So I think XIncluding the sym.out file is not necessary.

@broeder-j
Member Author

broeder-j commented Sep 17, 2020

> sym.out is not created with the -explicit option in the 'old' inpgen. The new inpgen will create this file with a command-line option even with -explicit (I have to check the details). If it does, then it is just a matter of adding this file to the retrieve list and the remote-copy list. So I think XIncluding the sym.out file is not necessary.

Why did they change this? Shouldn't the default behavior be that '-explicit' puts everything in the inp.xml (no sym.out file), and that not using '-explicit', or using some other option, gives you a sym.out file?

@anoopkcn
Contributor

But I have to talk to @gregor about this. Another option is to parse the sym.out data from the inp.xml file, which is what I'm doing now in aiida-spex. It's not an elegant solution, though, since I need XML routines just for this and they are not used anywhere else.

@broeder-j
Member Author

I also checked some time ago whether inpgen finds the same symmetries as spglib does. Either way, I think routines to parse the symmetries are not bad to have.

@Tseplyaev
Collaborator

@anoopkcn all the contents should be already parsed and stored in the FleurinpData attributes:

In [10]: random_fleurinp = load_node(51839)

In [11]: random_fleurinp.attributes['inp_dict']['cell']['symmetryOperations']
Out[11]: 
{'symOp': {'row-1': '-1 0 0 .0000000000',
  'row-2': '0 -1 0 .0000000000',
  'row-3': '0 0 1 .0000000000'}}
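As a rough sketch of how such an entry could be turned into numbers, the helper below splits each row string into three rotation entries and one translation component. That row layout, and the possibility that 'symOp' holds a list of dicts when there are several operations, are assumptions read off the output above; `parse_sym_ops` is a hypothetical name, not part of aiida-fleur.

```python
def parse_sym_ops(symmetry_operations):
    """Convert a 'symmetryOperations' dict into (rotation, translation) pairs.

    Assumes each row string holds three rotation entries followed by one
    translation component, as in the output shown above.
    """
    sym_ops = symmetry_operations["symOp"]
    # Assumption: a single operation is stored as a dict, several as a list.
    if isinstance(sym_ops, dict):
        sym_ops = [sym_ops]

    parsed = []
    for op in sym_ops:
        rotation, translation = [], []
        for key in ("row-1", "row-2", "row-3"):
            *rot, shift = (float(x) for x in op[key].split())
            rotation.append(rot)
            translation.append(shift)
        parsed.append((rotation, translation))
    return parsed


example = {
    "symOp": {
        "row-1": "-1 0 0 .0000000000",
        "row-2": "0 -1 0 .0000000000",
        "row-3": "0 0 1 .0000000000",
    }
}
ops = parse_sym_ops(example)
rotation, translation = ops[0]
```

Returning plain nested lists keeps the sketch dependency-free; converting them to arrays for comparison with another symmetry finder would be a separate step.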

@anoopkcn
Contributor

That's a good suggestion. Then all I have to do is to find the corresponding fleurinp data. I will try this.

@Tseplyaev
Collaborator

Tseplyaev commented Sep 18, 2020

There is no need to find it; you can just create it on the fly without storing it in the database:

random_fleurinp = FleurinpData(files=['temp/inp.xml'])

It is actually the way to parse the content of an inp.xml file - just pass it to the FleurinpData initialiser and it will do the job.

@janssenhenning
Contributor

janssenhenning commented Oct 15, 2020

It might also be useful to provide support for removing additional files via the FleurinpModifier (obviously, removing the inp.xml would make no sense at all). For example, I noticed that when you add an n_mmp_mat file to a FleurinpData for an scf workchain, the original n_mmp_mat is uploaded for all runs and the density matrix is always reset to the initial guess. This way you lose all the convergence again. This could be solved either by constructing a new FleurinpData from just the inp.xml or by removing the n_mmp_mat in some way via the FleurinpModifier for the subsequent runs.

@Tseplyaev
Collaborator

@janssenhenning there is the del_file method that might help. Note that this method modifies an existing FleurinpData, so you have to clone a stored FleurinpData in order to modify it.

In [1]: from aiida_fleur.data.fleurinp import FleurinpData

In [2]: f = FleurinpData(files=['inp.xml', 'relax.xml'])

In [3]: f.store()
Out[3]: <FleurinpData: uuid: b6ad8222-2922-4b3e-b5ce-6476c4ffe53e (pk: 58287)>

In [4]: f.del_file('relax.xml')
...
ModificationNotAllowed: cannot modify the repository after the node has been stored

In [5]: w = f.clone()

In [6]: w.del_file('relax.xml')

In [7]: w.files
Out[7]: ['inp.xml']
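The clone-then-delete pattern from this session could be wrapped in a small helper. The sketch below relies only on the `clone()`, `del_file()` and `files` members shown above; `without_files` is a hypothetical name, and `StubFleurinp` is a made-up stand-in that merely mimics the stored-node behaviour so the example runs without AiiDA.

```python
def without_files(fleurinp, names):
    """Return an unstored clone of `fleurinp` with the given files removed."""
    if "inp.xml" in names:
        raise ValueError("removing inp.xml would make no sense")
    clone = fleurinp.clone()
    for name in names:
        if name in clone.files:
            clone.del_file(name)
    return clone


class StubFleurinp:
    """Hypothetical stand-in for FleurinpData, for illustration only."""

    def __init__(self, files, stored=False):
        self.files = list(files)
        self.stored = stored

    def clone(self):
        # A clone starts out unstored and can therefore be modified.
        return StubFleurinp(self.files, stored=False)

    def del_file(self, name):
        if self.stored:
            raise RuntimeError("cannot modify a stored node")
        self.files.remove(name)


stored = StubFleurinp(["inp.xml", "n_mmp_mat"], stored=True)
trimmed = without_files(stored, ["n_mmp_mat"])
print(trimmed.files)
```

The stored object is left as it was; only the clone loses the file.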

@janssenhenning
Contributor

@Tseplyaev Thanks, that's the thing I'm looking for. But I have to think about how one would implement this in the scf workchain, as the fleurinp is only modified at the beginning, before the first calculation, if I understand it correctly.

@broeder-j
Member Author

> @janssenhenning there is the del_file method that might help. Note that this method modifies an existing FleurinpData, so you have to clone a stored FleurinpData in order to modify it.

But it might still make sense to expose this in FleurinpModifier as well, because if you clone, delete a file and store, you lose the provenance. FleurinpModifier does all modifications at once, so the file removal should be in there.

@broeder-j
Member Author

broeder-j commented Oct 15, 2020

> It might also be useful to provide support for removing additional files via the FleurinpModifier (obviously, removing the inp.xml would make no sense at all). For example, I noticed that when you add an n_mmp_mat file to a FleurinpData for an scf workchain, the original n_mmp_mat is uploaded for all runs and the density matrix is always reset to the initial guess. This way you lose all the convergence again. This could be solved either by constructing a new FleurinpData from just the inp.xml or by removing the n_mmp_mat in some way via the FleurinpModifier for the subsequent runs.

We could also tell run_fleur(self) not to copy the n_mmp_mat to the remote, i.e. remove it from the 'local_copy_list' using the settings node. In the first run it has to be copied, i.e. one should check the loop count. How do you upload the other one?

But it is probably better to implement it in the FleurCalculation, so that the file is not copied from fleurinp if a RemoteData node from a FleurCalculation is given.

389            allfiles = fleurinp.files
390            for file1 in allfiles:
391                local_copy_list.append((fleurinp.uuid, file1, file1))
...
428            if has_fleurinp:
429                #The n_mmp_mat file from fleurinp takes priority
430                has_nmmpmat_file = has_nmmpmat_file and self._NMMPMAT_FILE_NAME not in fleurinp.files

I.e. by default all files from fleurinp are always added.
@janssenhenning why did you write line 430, i.e. that fleurinp has priority over remote? This creates the behavior, right?
If you increase the logging level of the AiiDA daemon to INFO, we log the copy file lists. They are also in the .aiida folder in the jobcalc folder on the machine.
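One possible reading of the proposal to skip n_mmp_mat when a parent FleurCalculation is given is sketched below as a standalone function. It is not the FleurCalculation code: `build_local_copy_list` and its parameters are hypothetical, the file-name constant is an assumed value, and the sketch presumes the remote folder supplies whatever the restart needs.

```python
NMMPMAT_FILE_NAME = "n_mmp_mat"  # assumed value of self._NMMPMAT_FILE_NAME


def build_local_copy_list(fleurinp_uuid, fleurinp_files, has_parent_remote):
    """Collect (uuid, source name, target name) tuples for upload."""
    local_copy_list = []
    for filename in fleurinp_files:
        # With a parent calculation the initial-guess density matrix from
        # fleurinp is skipped, so it cannot reset the converged one.
        if has_parent_remote and filename == NMMPMAT_FILE_NAME:
            continue
        local_copy_list.append((fleurinp_uuid, filename, filename))
    return local_copy_list


files = ["inp.xml", "n_mmp_mat"]
first_run = build_local_copy_list("some-uuid", files, has_parent_remote=False)
restart = build_local_copy_list("some-uuid", files, has_parent_remote=True)
print(first_run)
print(restart)
```

In the first run both files are uploaded; in a restart only inp.xml is taken from fleurinp.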

@janssenhenning
Contributor

janssenhenning commented Oct 15, 2020

@broeder-j settings would need an additional key 'remove_from_local_copy_list' for this, right?

Line 430 is only used for versions without HDF5 (see lines 437-442).

437                if with_hdf5:
438                    copylist = self._copy_scf_hdf
439                elif has_nmmpmat_file:
440                    copylist = self._copy_scf_ldau_nohdf
441                else:
442                    copylist = self._copy_scf

It is not responsible for the behaviour I'm describing. In the HDF5 case it would be enough to just remove n_mmp_mat from the local_copy_list, because the n_mmp_mat is not necessary to continue a calculation.
For versions without HDF5 this could become a problem, because in principle the n_mmp_mat file always has to be copied for the next iteration.

@janssenhenning
Contributor

@broeder-j But yes, for calculations without HDF5 where both remote_data and a fleurinp are specified, it could make sense that the n_mmp_mat in remote_data takes priority.
