save julia objects as a self contained group in a "normal" hdf5 file #398

Closed
tkuraku opened this issue May 28, 2022 · 4 comments

Comments

@tkuraku
tkuraku commented May 28, 2022

This is a feature request that came up while I was working with some HDF5 files created by a separate program. With JLD2 I can't save a Julia object into an existing HDF5 file because it isn't a "JLD2" file. Instead of making the whole file a JLD2 file, it would be super useful to have each saved Julia object be a self-contained JLD2 group, so that it could readily be saved into an existing "normal" HDF5 file. This would be awesome for interoperability with files that don't originate from Julia. In the same vein, standard data types such as standard arrays could be transparently written and read like they are in HDF5.jl, and only more complex Julia objects would need to be serialized into a JLD2 group.

@JonasIsensee
Collaborator

Hi @tkuraku,

> In the same vein, standard data types such as standard arrays could be transparently written and read like they are in HDF5.jl, and only more complex Julia objects would need to be serialized into a JLD2 group.

This is already the case.

On the more general case:
To do what you are suggesting, you need to be able to read and edit normal HDF5 files. This is hard.
JLD2 does not rely on the HDF5 binary dependency and instead reimplements a part of the HDF5 format spec (which is huge, with lots of bit-fiddling).

Because I also found this lack of interoperability quite frustrating, I created #388.
With this PR it is possible to read a large portion of HDF5 files.
However, the format spec only states what a file needs to look like to be valid, not how to create it.
JLD2 files use the simplest way to create these files and leave out many optimizations (groups are really just lists and not funky heap structures). This is reasonably easy to produce and edit.
However, to edit just any HDF5 file out there, you would need to implement the algorithms to update the more complex structures.

This is possible but also a significant undertaking.

@tkuraku
Author

tkuraku commented May 28, 2022

I see, if you don't rely on the HDF5 binary it becomes a very big task indeed. Maybe an alternative: for the case when you need interoperability, JLD2 could provide serialization and deserialization functions to be used in conjunction with HDF5.jl (which uses the HDF5 binaries), and when you don't need it, you can use the JLD2 library as is. Maybe something like this:

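# JLD2.serialize / JLD2.deserialize are the hypothetical functions proposed here
using HDF5

h5open(data_file, "w") do fd
    fd["julia_object"] = JLD2.serialize(my_object)
end

# Return the value from the do-block; assigning inside it would stay local
my_object = h5open(data_file, "r") do fd
    JLD2.deserialize(read(fd["julia_object"]))
end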

Just an idea. Thanks for all your hard work!
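
To make the idea above concrete, here is a minimal sketch of the serialize-to-bytes round trip. It uses Julia's `Serialization` standard library as a stand-in, not JLD2's own format, and the helper names `to_bytes` and `from_bytes` as well as the `Measurement` type are made up for illustration. One caveat to keep in mind: the `Serialization` format is not guaranteed to be stable across Julia versions, so this is probably unsuitable for long-term storage.

```julia
using Serialization

# Hypothetical helper: turn any Julia object into a plain byte vector.
function to_bytes(obj)
    io = IOBuffer()
    serialize(io, obj)
    return take!(io)   # Vector{UInt8}
end

# Hypothetical helper: rebuild the object from the stored bytes.
from_bytes(bytes::AbstractVector{UInt8}) = deserialize(IOBuffer(bytes))

# Illustrative type standing in for a "more complex Julia object".
struct Measurement
    label::String
    values::Vector{Float64}
end

original = Measurement("run-1", [1.0, 2.5, 4.0])
bytes = to_bytes(original)
restored = from_bytes(bytes)
```

Since `bytes` is an ordinary `Vector{UInt8}`, it should be possible to store it as a dataset in an existing file with HDF5.jl (for example `fd["julia_object"] = bytes` inside `h5open`, and `read(fd["julia_object"])` to get it back). Other HDF5 tools would then see an opaque byte array rather than a structured group.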

@pvillacorta

Hi,
I am really interested in this. I would need to store julia objects (specifically, functions) as a subset of a bigger HDF5 file. Have there been any new developments regarding this?
Thank you.

@JonasIsensee
Collaborator

> Hi, I am really interested in this. I would need to store julia objects (specifically, functions) as a subset of a bigger HDF5 file. Have there been any new developments regarding this? Thank you.

Functions are the one main thing that JLD2 cannot store (only by referencing the name...).

@JonasIsensee closed this as not planned on May 20, 2024