clarify species type extension #264

Open
wants to merge 4 commits into
base: upcoming-2.0.0


@s9105947 s9105947 commented Jan 27, 2022

This PR clarifies the species type extension and adds a full definition using ABNF.
Technically, it introduces breaking changes; see below for details.

Issues: #256 #257 #258 #259 #260 #261

Description

  • Species Type: Full Grammar of some Ideas #256 Full ABNF grammar
    • I added a full definition using ABNF in every section.
    • In the future the rule for elements might have to be updated.
    • Examples are still the main way readability is ensured -- the grammar is intended as an auxiliary tool if the textual description is not sufficient.
  • "single species" vs. "multiple species" #257 single species and multiple species
    • A clearer distinction is made between a single particle species and a list of particle species.
    • Breaking: other: is not allowed in lists. Allowing it would require forbidding the semicolon ; in its content.
    • Separator placement is clarified: a double list separator is forbidden (item1;;item2); a trailing separator is optional.
    • Clarified that a list with one particle species and a single particle species must be treated identically, since they use the same syntax and cannot (necessarily) be distinguished.
    • We discussed encoding (relative) quantities in lists. This is not included.
  • Species Type: Character Set after "other:" #258 charset after other:
    • Clarified that only printable ASCII characters (0x21-0x7e) and space (0x20) are allowed.
    • Breaking: Control characters are now forbidden. Note that newline (ASCII 0x0a) and horizontal tab (ASCII 0x09) are therefore forbidden too.
    • Added a note discouraging a bare other: (without suffix) and trailing spaces, even though both are allowed.
  • Species Type: Clarification on Atom Syntax #259 clarify atom syntax
    • Clarified that element symbols are case sensitive.
  • Species Type: Suggestion for Ion Syntax #260 ion syntax
    • The proposed ion syntax has been dropped.
    • Clarified that ions are not supported by the base standard.
    • Added note discussing when charge state SHOULD NOT and when it MAY be encoded by an implementation.
  • Species Type: Clarifications on Molecule Syntax #261 molecule syntax
    • The proposed molecule syntax has been dropped.
    • Clarified by rewriting the paragraph.

In its entirety, this aims to clarify the current version of the standard.
Except for the treatment of other:, all unambiguous parts have been left untouched.

Valid examples:

electron
He
#2H
electron;H
electron;H;
Si;
other:
other:myitem
other:this looks like a list;but is actually a single particle species
other: note that this text can be very long
other: there is no \n escaping

Invalid examples:

species        invalid because
other:·        non-ascii char ·
electron;;H    empty list item/double semicolon ;;
Uux            not part of the periodic table
H2O            molecules not supported
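
The valid and invalid examples above can be checked with a small Python sketch. This is not the ABNF from the PR: the set of named particles is an illustrative placeholder, the isotope prefix (`#` followed by a mass number) is inferred from the `#2H` example only, and the function names are hypothetical.

```python
import re

# Illustrative placeholder; the authoritative list of named particles
# lives in the standard and is not reproduced here.
NAMED_PARTICLES = frozenset({"electron", "positron", "proton", "neutron", "photon"})

# Symbols of the 118 named elements (case sensitive).
ELEMENTS = frozenset("""
H He Li Be B C N O F Ne Na Mg Al Si P S Cl Ar K Ca Sc Ti V Cr Mn Fe Co Ni
Cu Zn Ga Ge As Se Br Kr Rb Sr Y Zr Nb Mo Tc Ru Rh Pd Ag Cd In Sn Sb Te I
Xe Cs Ba La Ce Pr Nd Pm Sm Eu Gd Tb Dy Ho Er Tm Yb Lu Hf Ta W Re Os Ir Pt
Au Hg Tl Pb Bi Po At Rn Fr Ra Ac Th Pa U Np Pu Am Cm Bk Cf Es Fm Md No Lr
Rf Db Sg Bh Hs Mt Ds Rg Cn Nh Fl Mc Lv Ts Og
""".split())

# Assumed isotope form: '#', a mass number, then an element symbol.
ISOTOPE = re.compile(r"#[1-9][0-9]*([A-Za-z]+)")

# Text after "other:": space (0x20) and printable ASCII (0x21-0x7e) only.
OTHER_BODY = re.compile(r"[\x20-\x7e]*")


def is_valid_item(item: str) -> bool:
    """Check one list item: a named particle, an element, or an isotope."""
    if item in NAMED_PARTICLES:
        return True
    match = ISOTOPE.fullmatch(item)
    symbol = match.group(1) if match else item
    return symbol in ELEMENTS


def is_valid_species_type(value: str) -> bool:
    """Check a whole speciesType string (single species or list)."""
    if value.startswith("other:"):
        # Always a single species, so ';' is ordinary content here.
        return OTHER_BODY.fullmatch(value[len("other:"):]) is not None
    items = value.split(";")
    if len(items) > 1 and items[-1] == "":
        items.pop()  # the trailing separator is optional
    # An empty item (e.g. from ';;') fails is_valid_item.
    return all(is_valid_item(item) for item in items)
```

Because `other:` is only recognized at the start of the whole string, an `other:` entry inside a list falls through to the per-item check and is rejected, which matches the breaking change described above.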

Affected Components

  • EXT-SpeciesType

Logic Changes

  • other: forbidden in lists
  • imprecise description of molecules removed; implementations relying on it would now be non-conformant

Writer Changes

  • Forbid non-printing ascii in speciesTypes
  • ensure lists do not contain other:
  • ensure that particle species lists with one item are equivalent to single particle species

Existing writers are compatible, as long as they

  • do not use non-printing ascii chars for speciesType
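
As a rough illustration of the writer-side checks listed above, a writer could serialize its species through a helper like the following. This is a sketch only: `format_species_type` is a hypothetical name, not part of any existing openPMD tool, and it does not check items against the element or particle names.

```python
from typing import Iterable


def format_species_type(species: Iterable[str]) -> str:
    """Join species names into a speciesType string (hypothetical helper).

    Expects an iterable of strings, one per particle species.
    """
    items = list(species)
    if not items:
        raise ValueError("at least one particle species is required")
    for item in items:
        # Only space (0x20) and printable ASCII (0x21-0x7e) may be written.
        if not all(0x20 <= ord(ch) <= 0x7E for ch in item):
            raise ValueError(f"non-printing or non-ASCII character in {item!r}")
    # A lone "other:..." entry is written verbatim; ';' is content there.
    if len(items) == 1 and items[0].startswith("other:"):
        return items[0]
    for item in items:
        if item.startswith("other:"):
            raise ValueError("other: is not allowed in lists")
        if item == "" or ";" in item:
            raise ValueError(f"invalid list item {item!r}")
    # A one-item list is written exactly like a single species.
    return ";".join(items)
```

Writing no trailing separator means a one-item list and a single particle species produce the same string, so the equivalence requirement holds by construction.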

Reader Changes

Lists with one item must be considered single particle species.
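
One way a reader could satisfy this is to normalize every speciesType into a tuple of items before comparing. The sketch below assumes a hypothetical `parse_species_type` helper; it handles only separators and does not validate the individual species names.

```python
from typing import Tuple


def parse_species_type(value: str) -> Tuple[str, ...]:
    """Normalize a speciesType string into a tuple of species (sketch)."""
    if value.startswith("other:"):
        # other: is never a list; semicolons belong to its content.
        return (value,)
    items = value.split(";")
    if len(items) > 1 and items[-1] == "":
        items.pop()  # drop the optional trailing separator
    if any(item == "" for item in items):
        raise ValueError(f"empty list item in {value!r}")
    return tuple(items)


# "Si" and "Si;" normalize to the same value, so they compare equal.
same = parse_species_type("Si") == parse_species_type("Si;")
```

With this normalization, downstream code never needs to ask whether the file contained a list or a single species; it only sees how many items there are.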

As molecules and ions have been explicitly forbidden, reader implementations are encouraged to specify their additionally accepted syntax explicitly.
(Which they should do anyway.)

(I did not encounter any other mentions of speciesType in the related tools.)

Data Converter

Not implemented, hence no changes necessary.

@s9105947
Author

s9105947 commented Jan 31, 2022

Following the discussion in #262, I replaced the RFC 2119 terms (MUST, SHOULD NOT, ...) with bold lowercase text.
I can remove the emphasis, though I'm a little averse to that: I used the subtle differences between must (not) and should (not) deliberately, and fear they would get lost. That said, diverging from the style of the rest of the standard must be avoided.

@ax3l ax3l requested review from ax3l and DavidSagan February 10, 2022 16:39
@ax3l ax3l added the EXT: SpeciesType physical particle species extension label Feb 10, 2022
@ax3l ax3l self-assigned this Feb 10, 2022
Member

@ax3l ax3l left a comment

Thank you already!

I marked a few typos and will do a thorough review soon!

Collaborator

@DavidSagan DavidSagan left a comment

Thanks for the work done by everyone!

Co-authored-by: Axel Huebl <axel.huebl@plasma.ninja>
@s9105947
Author

If I read the openPMD validator correctly (do I?), currently only non-empty strings consisting of a-zA-Z0-9:;- are accepted as SpeciesType. Though this is not documented in the current standard, should we put it into the current version?

My main idea in using printable ASCII chars was to allow the broadest "non-harmful" charset so as not to break compatibility. But I guess that the openPMD validator is also somewhat authoritative, so presumably nothing would be broken by demanding a-zA-Z0-9;:-?
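
To make the trade-off concrete, here is a small Python sketch of the restricted charset. The pattern is taken from the reading of the validator described above, not copied from the validator's source, and `passes_restricted_charset` is a hypothetical name.

```python
import re

# Assumed rule: a non-empty string made only of a-zA-Z0-9, ':', ';' and '-'.
RESTRICTED_CHARSET = re.compile(r"[a-zA-Z0-9:;-]+")


def passes_restricted_charset(value: str) -> bool:
    """True if the whole string uses only the restricted character set."""
    return RESTRICTED_CHARSET.fullmatch(value) is not None


# Examples listed as valid in the PR description.
examples = [
    "electron",
    "He",
    "#2H",
    "electron;H;",
    "other:",
    "other:myitem",
    "other: note that this text can be very long",
]
rejected = [v for v in examples if not passes_restricted_charset(v)]
```

Under this assumed rule, `#2H` (because of `#`) and any other: text containing spaces end up in `rejected`, so restricting the charset would narrow what the examples in this PR currently allow.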

@ax3l
Member

ax3l commented Jan 25, 2023

I think that makes sense: if one uses the species type extension, then one should actually specify a SpeciesType.
(We have special keywords like None for some attributes, but I am not sure it makes sense here.)

Labels
EXT: SpeciesType physical particle species extension
4 participants