Optional plain HDF5 data mapping without reconstruction of data source type #505
Comments
Along those lines, I think the same people who would be interested in something like this would also be interested in syntactic sugar for #504, i.e. something like |
That sounds nice. This syntax |
that sounds nice! |
I like this idea of an optional The place to start, I'd say, is to define the expected behaviour and desired API style.
Another Quirk: One idea: Another thing to consider is #487 . Julia can construct arbitrarily large structs i.e. through structs with large ntuples as fields or through code generation. JLD2 has a hard limit on the size of the type description. Last thing is type stability / compilation / run time. |
@koehlerson you probably didn't get notifications after I first commented with an incomplete message. To start playing around with ideas, one doesn't really need to work with JLD2 at all.
and build an interface that does the desired things. |
Hey, sorry for the late reply!
Conversion function vs list of fields
So far, I have only used some of the fields directly, but I can imagine that some people may want to compute something else before saving it. So I think some kind of function would be nice to keep that flexibility. Or maybe some folks want to compute a different representation for saving (thinking of #487 and the variety of array layouts, ML model layouts etc.) compared to the layout that is present in the struct's fields.
Composability
I'm not sure if I can follow 100% but what I would imagine is that either you have the bijection or you have the full nesting approach, i.e.
Why I'm unsure if I can follow is the following
Do you mean that you cannot recreate the struct that was saved? From my perspective this would be the desired behavior: you opt out of the bijection and only save e.g. a NamedTuple with "crucial information" in a simpler form (primitives, isbitstype, ...). If a user wants to rebuild something, nothing is guaranteed, and it's up to the user to save as much as needed to rebuild on their own with some custom function.
This is a very good point, since we have in Ferrite.jl also some singletons that are exposed to the user. For me personally I'd be fine with saving a string, but I'm not sure if there is any downside to it.
Maybe the set of stuff that is supported by HDF5.jl? https://juliaio.github.io/HDF5.jl/stable/#Supported-data-types
If I understand correctly, then, what you are trying to say is that a
Other remarks
This sounds nice, especially since this solves in my head somewhat a user experience problem. Usually I have one big simulation struct with the parameters and I want to reconstruct it, but everything else is okay to be a
Does this mean that the ML model from the issue is serialized by JLD2 and the type parameters are too large which could be dodged by utilizing the "plain HDF5" approach with NamedTuple?
That sounds nice, I will do that as soon as you have given some feedback, because I'm quite unsure to what extent the ideas make sense. My thinking is probably a bit too narrow towards my specific problem, so I'm happy to hear other perspectives :) |
Here's an example of a fundamental problem.
|
I'm sorry, this is a complex topic and so my answers will be a bit disorganized.
|
A place to start experimenting might be in usability of Here is an experimental package I built at some point. It has some tooling for retrieving type info from JLD2 files. The second (orthogonal) approach would be to implement a function that does all the conversions you can think of prior to handing it to JLD2. |
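A rough sketch of what such a "convert everything beforehand" function could look like, written in plain Julia without touching JLD2. The name to_plain and the example types ElasticLaw, Material and Simulation are made up for illustration; they are not part of JLD2 or Ferrite.jl. The assumption here is that numbers, strings and numeric arrays pass through unchanged, structs become nested NamedTuples, and field-less types (singletons) are replaced by their type name as a string.

```julia
# Hypothetical helper (not a JLD2 API): recursively reduce a value to
# numbers, strings, numeric arrays and NamedTuples of those.
to_plain(x::Union{Number,AbstractString,AbstractChar}) = x
to_plain(x::Symbol) = String(x)
to_plain(x::AbstractArray{<:Number}) = x
to_plain(x::AbstractArray) = map(to_plain, x)
to_plain(x::Union{Tuple,NamedTuple}) = map(to_plain, x)

# Generic fallback for structs: one NamedTuple entry per field.
function to_plain(x::T) where {T}
    names = fieldnames(T)
    # Assumption: singletons and other field-less types are stored by name.
    isempty(names) && return string(nameof(T))
    return NamedTuple{names}(map(n -> to_plain(getfield(x, n)), names))
end

# Made-up example types, only for demonstration.
struct ElasticLaw end            # singleton exposed to the user
struct Material{L}
    law::L
    E::Float64
end
struct Simulation
    material::Material{ElasticLaw}
    displacements::Vector{Float64}
end

sim = Simulation(Material(ElasticLaw(), 210.0), [0.0, 0.1])
plain = to_plain(sim)
```

The result no longer refers to any user-defined type, so nothing about Simulation can be reconstructed from it automatically. The generic fallback also descends into container types such as Dict, which would need their own methods in anything beyond a sketch.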
Please test out #522. I can't say it's elegant but it worked for my test cases. |
As discussed in Ferrite-FEM/Ferrite.jl#678, it would be nice to "dump" data sometimes, especially at the beginning of a computational project when structs still change. The custom serialization interface offers a way to map bijectively from A -> Aserialized (and vice versa). This, however, is not always needed; sometimes only certain fields need to be stored, namely those that can be expressed as Julia primitives/isbitstypes or arrays thereof.

As an example: let's say I have a simulation with some struct that holds the state of a material that I'm simulating. This struct is not needed to reproduce the simulation but serves more as "intermediate" results that may be relevant for postprocessing purposes. So there is no real need to store the full type or a custom serialization of it. Instead, I just want to store a single scalar of this type. See https://ferrite-fem.github.io/Ferrite.jl/stable/examples/plasticity/ especially the MaterialState. You only need the fields (all are isbitstype or primitives).

What I envision is some function, let's call it stuff_to_store for the sake of this issue, which is by default dispatched on ::Type{Any} and is in this case the identity mapping. However, a user can now overload this function for their own struct and specify a NamedTuple of things to save. What should be expected from the user is that all fields of the NamedTuple are directly supported by HDF5, i.e. they are primitives, isbitstypes, or arrays of these things. The tricky bit is now to call this for whatever should be stored and check if stuff_to_store is dispatched for the current object. If so, the NamedTuple is saved. If it isn't, the usual JLD2 machinery should kick in with all the nice features that are already implemented (including custom serialization).

By this, one could achieve an optional "plain HDF5" storage behavior in JLD2. I'm not quite sure where to start tackling this problem, but as @JonasIsensee pointed out, a separate package on top of JLD2 would be a good start. In order to be helpful I'd need some guidance on where we could sneak in such dispatches. The aforementioned things are already packed with implementation detail even though they aren't meant to be. More or less everything should be understood as an example of how to get to an optional "plain HDF5" storage scheme within JLD2.
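To make the idea a bit more concrete, here is a minimal sketch in plain Julia, independent of JLD2. The MaterialState below is a simplified, made-up stand-in for the one in the Ferrite.jl plasticity example, and is_plain is a hypothetical helper, not an existing API. The fallback is written on any value (x::Any) as an assumption about how the identity default could be expressed.

```julia
# Simplified stand-in for a material state (field names are made up).
struct MaterialState
    plastic_strain::Float64
    hardening::Float64
    converged::Bool
end

# Default: identity, so anything without an overload is passed on unchanged.
stuff_to_store(x) = x

# Opt-in overload: store only selected fields as a NamedTuple.
stuff_to_store(s::MaterialState) =
    (plastic_strain = s.plastic_strain, hardening = s.hardening)

# Hypothetical check that a value is isbits, an array of isbits elements,
# or a NamedTuple made of such values.
is_plain(x) = isbitstype(typeof(x))
is_plain(x::AbstractArray) = isbitstype(eltype(x))
is_plain(nt::NamedTuple) = all(is_plain, values(nt))

state = MaterialState(0.01, 2.5, true)
plain = stuff_to_store(state)        # NamedTuple with two Float64 fields
untouched = stuff_to_store("hello")  # no overload, so the identity applies
```

Where such a call would hook into the JLD2 write path is exactly the open question of this issue; the sketch only covers the user-facing part.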