Memo on parser
Note: Memo on parser before refactoring was moved to https://github.com/aiida-vasp/aiida-vasp/wiki/Memo-on-parser-before-refactoring.
(AiiDA core responsibility)
- `CalcJob` is finished.
- The `parse` method of the set parser is called. One can set the parser by specifying `metadata.options.parser_name` as a `Str` input of the `CalcJob`.
(Responsibility is now handed over to the plugin)
- Before we can execute the `parse` function on the plugin side, the parsing class that houses this function (`VaspParser` in this case) needs to be initialized. This involves the following steps:
  - We first map a function `get_quantity` to a `Delegate()` class. The goal when constructing the parser was to be as general as possible, so that we could configure the parser to compose a quantity that depends on, for instance, several different file parsers. The other option would be to manually add all combinations, but since this would introduce code duplication, the `Delegate()` approach was chosen. In addition, we wanted these quantities to be easy to extend and configure, the idea being that users could add custom file parsers without touching the core parser code. The composition is then configured with the parser settings. For the isolated VASP case this is certainly overkill, but it makes it possible to reuse this parser for other plugins and, maybe more importantly, to parse results of auxiliary codes such as Wannier90 with the same core engine. In fact, the parser was initially constructed with the aim of making a general parser for AiiDA. Notice that even though the `Delegate()` approach was chosen, there are other ways to achieve similar functionality.
  - The `settings` are initialized using the `ParserSettings` class. This contains all relevant settings, for instance which file parsers are associated with which physical files, whether some files are critical, etc. In addition, it defines which quantities end up on which output nodes and their respective keys.
  - Then the `quantities` are initialized using the class `ParsableQuantities`. This contains which quantities we can parse, whether files needed to parse the requested quantities are missing, and also, importantly, alternative parsers. E.g. if one typically fetches a parameter from `fileA`, one can specify that it can alternatively be parsed from `fileB`. If, say, `fileA` is then not present or there is some other issue with parsing it, the parameter is parsed from `fileB`, and so on. It is only initialized at this point. When `parse` is executed, this will call `quantities.somemethods` that handle and set these properties.
  - The file `parsers` are initialized using the `ParserManager` class. This basically sets the mapping from file parser classes to physical files, checks that the files are there, etc. It is again only initialized, and later calls to `parsers.somemethods` are performed after `parse` is executed to actually perform these tasks.
- The `parse` in `VaspParser` is executed and the actual parsing starts.
  - First, a few checks for missing critical files are performed. If a critical file is not found, an exit code is returned.
  - Then `quantities.setup` is executed, which
Parsing starts with `VaspParser`. The parsing from AiiDA is triggered by calling the `parse` method in `VaspParser`. This is intrinsic AiiDA functionality. When `parse` completes, the parsing should be completed and
DEFAULT_OPTIONS = {
    'add_trajectory': False,
    'add_bands': False,
    'add_chgcar': False,
    'add_dos': False,
    'add_kpoints': False,
    'add_energies': False,
    'add_misc': True,
    'add_structure': False,
    'add_projectors': False,
    'add_born_charges': False,
    'add_dielectrics': False,
    'add_hessian': False,
    'add_dynmat': False,
    'add_wavecar': False,
    'add_forces': False,
    'add_stress': False,
    'add_site_magnetization': False,
    'store_energies_sc': False,
}
FILE_PARSER_SETS = {
    'default': {
        'DOSCAR': {
            'parser_class': DosParser,
            'is_critical': False,
            'status': 'Unknown'
        },
        ...
The keys of `FILE_PARSER_SETS['default']` are accessed by the file names obtained from `parser.retrieved`, e.g., each of `[retrieved_file.name for retrieved_file in parser.retrieved.list_objects()]`.
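The lookup described above can be sketched as follows. This is a minimal, self-contained illustration and not the plugin's code: `DosParser` and `OutcarParser` are empty stand-in classes, the `OUTCAR` entry and its values are assumptions made for the example, the helper `match_file_parsers` is hypothetical, and the list of retrieved file names is hard-coded instead of coming from `parser.retrieved.list_objects()`.

```python
class DosParser:
    """Empty stand-in for the real file parser class."""


class OutcarParser:
    """Empty stand-in for the real file parser class."""


FILE_PARSER_SETS = {
    'default': {
        'DOSCAR': {'parser_class': DosParser, 'is_critical': False, 'status': 'Unknown'},
        # Hypothetical entry; the values are chosen for illustration only.
        'OUTCAR': {'parser_class': OutcarParser, 'is_critical': True, 'status': 'Unknown'},
    }
}


def match_file_parsers(retrieved_names, parser_set='default'):
    """Return (matched, missing_critical) for a list of retrieved file names.

    matched maps each retrieved file name that has a definition to its
    file parser class; missing_critical lists critical files not retrieved.
    """
    definitions = FILE_PARSER_SETS[parser_set]
    matched = {
        name: definitions[name]['parser_class']
        for name in retrieved_names
        if name in definitions
    }
    missing_critical = sorted(
        name
        for name, entry in definitions.items()
        if entry['is_critical'] and name not in retrieved_names
    )
    return matched, missing_critical


# In the plugin the names would come from parser.retrieved.list_objects().
matched, missing_critical = match_file_parsers(['DOSCAR', '_scheduler-stdout.txt'])
print(matched)
print(missing_critical)
```

File names without an entry in the set are simply ignored here, and a non-empty `missing_critical` corresponds to the situation in which an exit code would be returned.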
NODES = {
    'misc': {
        'link_name': 'misc',
        'type': 'dict',
        'quantities': ['total_energies', 'maximum_stress', 'maximum_force', 'symmetries', 'magnetization', 'notifications']
    },
    'kpoints': {
        'link_name': 'kpoints',
        'type': 'array.kpoints',
        'quantities': ['kpoints'],
    },
    'structure': {
        'link_name': 'structure',
        'type': 'structure',
        'quantities': ['structure'],
    },
    'poscar-structure': {
        'link_name': 'structure',
        'type': 'structure',
        'quantities': ['poscar-structure'],
    },
    ...
`NODES.keys()` (or `Settings.output_nodes_dict.keys()`) are identifiers that are used locally (here we call such an identifier `node_name`; it seems `node_name` is not stored in the AiiDA database). `NODES[node_name]['link_name']` is the AiiDA link label. Each element of `NODES[node_name]['quantities']` corresponds to one of those given by `'alternatives'` in `PARSABLE_ITEMS` and also to one of `ParsableQuantities()._parsable_quantities.keys()`.
NODES_TYPES = {
    'dict': ['total_energies', 'maximum_force', 'maximum_stress', 'symmetries', 'magnetization', 'site_magnetization', 'notifications'],
    'array.kpoints': ['kpoints'],
    'structure': ['structure'],
    'array.trajectory': ['trajectory'],
    'array.bands': ['eigenvalues', 'kpoints', 'occupancies'],
    'vasp.chargedensity': ['chgcar'],
    'vasp.wavefun': ['wavecar'],
    'array': [],
}
DEFAULT_OPTIONS = {
    'quantities_to_parse': [
        'structure', 'eigenvalues', 'dos', 'bands', 'kpoints', 'occupancies', 'trajectory', 'energies', 'projectors', 'dielectrics',
        'born_charges', 'hessian', 'dynmat', 'forces', 'stress', 'total_energies', 'maximum_force', 'maximum_stress'
    ],
    'energy_type': ['energy_no_entropy']
}
The items of `'quantities_to_parse'` are used to access the keys of `PARSABLE_ITEMS`.
PARSABLE_ITEMS = {
    'structure': {
        'inputs': [],
        'name': 'structure',
        'prerequisites': [],
        'alternatives': ['poscar-structure']
    },
    ...
`self._parsable_items = self.PARSABLE_ITEMS`. This can be accessed as the attribute `parsable_items` of the file parser instance (`@property`). `'name'` corresponds to the elements of `'quantities'` in the `NODES` items. Elements of `'prerequisites'` and `'alternatives'` correspond to keys of `parsable_items`.
PARSABLE_ITEMS = {
    'poscar-structure': {
        'inputs': [],
        'name': 'structure',
        'prerequisites': [],
    },
}
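How `'alternatives'` and `'name'` tie the two `PARSABLE_ITEMS` definitions together can be illustrated with a small sketch. It rests on several assumptions: the two definitions, which belong to different file parsers, are merged into one dict for brevity; `resolve_quantity_key` is a hypothetical helper and not part of the plugin; and `available_keys` stands for the quantity keys whose file parser actually found its file.

```python
PARSABLE_ITEMS = {
    'structure': {
        'inputs': [],
        'name': 'structure',
        'prerequisites': [],
        'alternatives': ['poscar-structure'],
    },
    'poscar-structure': {
        'inputs': [],
        'name': 'structure',
        'prerequisites': [],
    },
}


def resolve_quantity_key(quantity_key, available_keys):
    """Return the requested key if available, else its first available alternative.

    Returns None when neither the key nor any alternative can be parsed.
    """
    if quantity_key in available_keys:
        return quantity_key
    for alternative in PARSABLE_ITEMS[quantity_key].get('alternatives', []):
        if alternative in available_keys:
            return alternative
    return None


# Only the POSCAR-based definition is usable in this example.
key = resolve_quantity_key('structure', {'poscar-structure'})
print(key)
print(PARSABLE_ITEMS[key]['name'])
```

Because both definitions share the same `'name'`, the quantity is called `structure` regardless of which file it was parsed from.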
- `node_key` (`node_name`): `NODES.keys()`, `add_xxxx` in `DEFAULT_OPTIONS`
- `quantity_items`: `for quantity_key, quantity_dict in quantity_items.items()`
- `quantity_name`: `quantity_dict['name']`, `NODES[node_key]['quantities']`
- `quantity_key`: `PARSABLE_ITEMS.keys()`, `quantity_dict['alternatives']`, `quantity_dict['prerequisites']`
- `quantity_dict`: `PARSABLE_ITEMS[quantity_key]`
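The relation between `node_key`, the `add_xxxx` options and the quantity names can be traced with a short sketch. It is an illustration only: `NODES` and `OPTIONS` are trimmed copies of the definitions in this memo, `quantity_names_per_link` is a hypothetical helper, and the rule that a node is requested through the option `'add_' + node_key` is an assumption read off from the naming list above.

```python
NODES = {
    'misc': {
        'link_name': 'misc',
        'type': 'dict',
        'quantities': ['total_energies', 'maximum_stress', 'maximum_force',
                       'symmetries', 'magnetization', 'notifications'],
    },
    'kpoints': {'link_name': 'kpoints', 'type': 'array.kpoints', 'quantities': ['kpoints']},
    'structure': {'link_name': 'structure', 'type': 'structure', 'quantities': ['structure']},
}

# Subset of the settings DEFAULT_OPTIONS shown earlier.
OPTIONS = {'add_misc': True, 'add_kpoints': False, 'add_structure': False}


def quantity_names_per_link(options):
    """Map link_name -> quantity names for every node_key that is switched on."""
    selected = {}
    for node_key, node_dict in NODES.items():
        # Assumption: an output node is requested via the option 'add_' + node_key.
        if options.get('add_' + node_key, False):
            selected[node_dict['link_name']] = list(node_dict['quantities'])
    return selected


print(quantity_names_per_link(OPTIONS))
print(quantity_names_per_link({**OPTIONS, 'add_structure': True}))
```

With the defaults only the `misc` node is selected; switching on `add_structure` adds the `structure` link with its single quantity name.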
- In the original implementation we preloaded `OUTCAR`, `vasprun.xml`, etc. Now the file parser is loaded for every quantity. We should consider preloading.
- In addition, we should consider releasing memory when a file parser is no longer needed (while the parser continues with other file parsers).
- `show_screening_steps` should possibly be integrated with the AiiDA debug settings etc.
- Try to make the node composer concept more general, such that it is possible to have several dict nodes etc.
- `get_node_inputs_from_file_parser` is needed only for tests?
- Consider simplifying the file parsers, e.g. the `init` stuff. In addition, see if we can move more into the `BaseFileParser` and make each file parser simpler.
- Consider removing `BaseParser` from the file parser module.
- Consider not bringing the exit codes along into the parsing and instead using another container for errors (say the `notifications` or a more general one). Then, in say `parse`, return one exit code that we define on the calculation, where we can for instance update the message with text from the notification.
- Consider `get_quantity_from_input` etc. to check if there is use for it and change it to comply with the new standard.
- Decide how to handle composed calculations utilizing different folders (say that VASP expects results or ejects results in different folders).