Skip to content
New issue

Have a question about this project? Sign up for a free GitHub account to open an issue and contact its maintainers and the community.

By clicking “Sign up for GitHub”, you agree to our terms of service and privacy statement. We’ll occasionally send you account related emails.

Already on GitHub? Sign in to your account

[WIP] Custom Hierarchies #1432

Open
wants to merge 24 commits into
base: dev
Choose a base branch
from

Conversation

franzpoeschel
Copy link
Contributor

@franzpoeschel franzpoeschel commented May 3, 2023

TODO

  • Merge first: Remove necessity for RecordComponent::SCALAR #1154
  • Await Pybind11 release that has merged this fix: Introduce recursive_container_traits pybind/pybind11#4623
  • Implement custom groups at the Iteration level that can hold custom attributes
  • Implement custom datasets inside custom hierarchy
  • Implement openPMD-defined meshes/particles-data from anywhere in the hierarchy
  • Implement extended meshesPath/particlesPath
  • Update the openPMD standard, see Allow user to store non-openPMD information openPMD-standard#115 (comment)
  • Lenient parsing in CustomHierarchy class
  • Maybe lazy parsing of the custom hierarchy?
  • Use the new SharedAttributableData pattern to better implement variable-based encoding (where series.iterations and series.iterations[0] are the same backend objects)
  • Replace Iteration::meshes with Iteration::mesh("subdir/E") and Iteration::allMeshes() -> std::map<std::string, Mesh>, similar Iteration::species("subdir/e") and Iteration::allSpecies() -> std::map<std::string, ParticleSpecies>. But should it be species("subdir/particles/e") or species("subdir/e")?

Diff: https://github.com/franzpoeschel/openPMD-api/compare/topic-remove-scalar-component..topic-custom-hierarchies


private:
template <typename... Arg>
iterator makeIterator(Arg &&...arg)

Check notice

Code scanning / CodeQL

Unused local variable Note

Variable arg is not used.
return iterator{this, std::forward<Arg>(arg)...};
}
template <typename... Arg>
const_iterator makeIterator(Arg &&...arg) const

Check notice

Code scanning / CodeQL

Unused local variable Note

Variable arg is not used.
REQUIRE(r["x"].resetDataset(dset).numAttributes() == 0); /* unitSI */
// REQUIRE(r["y"].unitSI() == 1);
REQUIRE(r["y"].resetDataset(dset).numAttributes() == 0); /* unitSI */
// REQUIRE(r["z"].unitSI() == 1);

Check notice

Code scanning / CodeQL

Commented-out code Note

This comment appears to contain commented-out code.
// unitSI is set upon flushing
// REQUIRE(r["x"].unitSI() == 1);
REQUIRE(r["x"].resetDataset(dset).numAttributes() == 0); /* unitSI */
// REQUIRE(r["y"].unitSI() == 1);

Check notice

Code scanning / CodeQL

Commented-out code Note

This comment appears to contain commented-out code.
@@ -966,6 +968,27 @@
#endif
}

TEST_CASE("baserecord_test", "[core]")

Check notice

Code scanning / CodeQL

Unused static function Note

Static function C_A_T_C_H_T_E_S_T_36 is unreachable (
autoRegistrar37
must be removed at the same time)
Comment on lines 874 to 886
// for (auto it = this->container().begin(); it != end; ++it)
// {
// if (it->first == RecordComponent::SCALAR)
// {
// this->container().erase(it);
// throw error::WrongAPIUsage(detail::NO_SCALAR_INSERT);
// }
// }

Check notice

Code scanning / CodeQL

Commented-out code Note

This comment appears to contain commented-out code.
Comment on lines 855 to 867
// for (auto it = this->container().begin(); it != end; ++it)
// {
// if (it->first == RecordComponent::SCALAR)
// {
// this->container().erase(it);
// throw error::WrongAPIUsage(detail::NO_SCALAR_INSERT);
// }
// }

Check notice

Code scanning / CodeQL

Commented-out code Note

This comment appears to contain commented-out code.
@@ -1353,3 +1378,44 @@
UniquePtrWithLambda<int[]> arrptrFilledCustom{
new int[5]{}, [](int const *p) { delete[] p; }};
}

TEST_CASE("scalar_and_vector", "[core]")

Check notice

Code scanning / CodeQL

Unused static function Note

Static function C_A_T_C_H_T_E_S_T_60 is unreachable (
autoRegistrar61
must be removed at the same time)
@@ -156,6 +159,39 @@
}
}

TEST_CASE("custom_hierarchies", "[core]")

Check notice

Code scanning / CodeQL

Unused static function Note

Static function C_A_T_C_H_T_E_S_T_4 is unreachable (
autoRegistrar5
must be removed at the same time)
@franzpoeschel franzpoeschel force-pushed the topic-custom-hierarchies branch 2 times, most recently from c8a68a5 to 6c87958 Compare May 11, 2023 09:19
@franzpoeschel franzpoeschel force-pushed the topic-custom-hierarchies branch 2 times, most recently from 86d8a73 to 399e6cd Compare May 30, 2023 12:43
@@ -156,6 +159,129 @@
}
}

TEST_CASE("custom_hierarchies", "[core]")

Check warning

Code scanning / CodeQL

Poorly documented large function Warning

Poorly documented function: fewer than 2% comments for a function of 194 lines.
@franzpoeschel
Copy link
Contributor Author

franzpoeschel commented Jun 19, 2023

comment removed, updated version in comments below

@franzpoeschel franzpoeschel force-pushed the topic-custom-hierarchies branch 2 times, most recently from 8c28fab to 605bd55 Compare June 29, 2023 11:11
test/CoreTest.cpp Fixed Show fixed Hide fixed
@franzpoeschel
Copy link
Contributor Author

franzpoeschel commented Jul 13, 2023

For the meshesPath (equivalently for particlesPath), I have now implemented a prototype that does the following:

A path /data/0/custom/group/meshes/E is a mesh if the meshesPath contains any of the following:

  1. Full path to the group containing the mesh: /custom/group/meshes/
  2. Full path to the mesh itself: /custom/group/meshes/E
  3. Shorthand notation: meshes/

These options follow the following rules:

  • Full paths are denoted by a leading slash and are based on the data path (/data/%T)
  • Paths that don't specify the mesh itself, but instead its parent path (e.g. meshes/) are denoted by a trailing slash

Remarks:

  • The shorthand notation achieves backwards compatibility with old openPMD files
  • A shorthand notation for the meshes themselves (e.g. meshesPath = ['E', ...]) is not implemented, that is on purpose. Regexes can be used instead if someone really wants to have that semantic (.*/E).

@franzpoeschel
Copy link
Contributor Author

franzpoeschel commented Jul 13, 2023

One nontrivial design question is how to deal with the traditional openPMD hierarchy, especially with the paths /data/%T/meshes and /data/%T/particles. There is no definition of any form of physical data for those groups in the openPMD standard, a normal openPMD file contains no attributes /data/%T/meshes/<attr_name>.

This suggests to me that in the extended openPMD standard with custom hierarchies these paths should be treated as "nothing special". Rather, they become the canonical, but not mandatory layout/organization of a simple openPMD dataset.

Two somewhat tricky consequences from this point of view:

1. There might be more than 1 meshes paths in the same group
E.g. the paths /data/%T/meshes and /data/%T/images might exist side by side. In the openPMD standard, this is no problem, in the openPMD-api this becomes challenging.
The problem is with the member Iteration::meshes (made even worse by the fact that it's not a getter method, but a data member). Should it point to /data/%T/meshes? To a union of both? What about writing?

Imo, the best solution is to consider Iteration::meshes a shorthand API that should not be used in more complex setups. Rather, since /data/%T/meshes is now just another normal path in the custom Iteration hierarchy, one should access iteration["meshes"].asContainerOf<Mesh>() for clarity.

Iteration::meshes will point to the first user-specified meshes path that takes the form of a shorthand notation. E.g., after series.setMeshesPath({"fields/"}), the call iteration.meshes will be the same as iteration["fields"].asContainerOf<Mesh>(). This ensures backwards compatibility.

(Note: Since Iteration::meshes is unfortunately a member and not a method, this means that the meshes path must be set before creating or opening any Iteration. And it was enough fighting with pointers to get things to that state.)

2. There might be custom data inside /data/%T/meshes
This is not really a problem, but could be unexpected. When setting series.setMeshesPath({"/meshes/E"}), you state that only the E field is a mesh. Since /data/%T/meshes is otherwise "just a regular group" with no special meaning, there might be other data in there, too, e.g. /data/%T/meshes/custom/hierarchy. It's the job of the user to create a meaningful data layout here.

src/CustomHierarchy.cpp Fixed Show fixed Hide fixed
@franzpoeschel franzpoeschel force-pushed the topic-custom-hierarchies branch 2 times, most recently from 53f968c to ba10099 Compare August 1, 2023 13:37
}
}

TEST_CASE("custom_hierarchies_no_rw", "[core]")

Check notice

Code scanning / CodeQL

Unused static function Note

Static function C_A_T_C_H_T_E_S_T_6 is unreachable (
autoRegistrar7
must be removed at the same time)
@franzpoeschel franzpoeschel mentioned this pull request Aug 1, 2023
1 task
@franzpoeschel franzpoeschel force-pushed the topic-custom-hierarchies branch 4 times, most recently from 31c7a25 to 1d47d17 Compare August 3, 2023 09:25
Introduction of iteration["meshes"].asContainerOf<Mesh>() as a more
explicit variant for iteration.meshes.
TODO: Since meshes/particles can no longer be directly addressed with
this, maybe adapt the class hierarchy to disallow mixed groups that
contain meshes, particles, groups and datasets at the same time.

Only maybe though..
The have their own meaning now and are no longer just carefully maintained
for backwards compatibility.
Instead, they are supposed to serve as a shortcut to all openPMD data
found further down the hierarchy.
Sign up for free to join this conversation on GitHub. Already have an account? Sign in to comment
Labels
Projects
None yet
Development

Successfully merging this pull request may close these issues.

None yet

1 participant