Skip to content
New issue

Have a question about this project? Sign up for a free GitHub account to open an issue and contact its maintainers and the community.

By clicking “Sign up for GitHub”, you agree to our terms of service and privacy statement. We’ll occasionally send you account related emails.

Already on GitHub? Sign in to your account

Exception when reconstructing a type inside a collection with a specific type parameter #374

Open
Octogonapus opened this issue Feb 3, 2022 · 4 comments

Comments

@Octogonapus
Copy link

Octogonapus commented Feb 3, 2022

Description

Using Julia v1.6 and JLD2 v0.4.19, JLD2 fails to load a struct containing a vector of a type which must be reconstructed. In principal, this makes sense, as the reconstructed type isn't convertible to the required type on the vector. This behavior isn't very helpful, though, as the exception occurs inside JLD2 code so I don't have an opportunity to address the problem.

My code can provide a way to convert a reconstructed Foo.Inner into a true Foo.Inner. Would it be possible to pass this function to JLD2 somehow? I'm not sure how my conversion method would get the required type information, though. I can see that JLD2 has this information, since it creates type names like this JLD2.ReconstructedTypes.var"##Main.Foo.Inner#258". That is probably enough to implement the conversion function I have in mind. In our own code, we save the type name as a string next to the data, and then use that to convert any reconstructed types into the true types. We need to do this conversion in the first place because we need to dispatch on these types; dispatch won't work with reconstructed types.

I'd also like I note that I can help work on this change, if it's something that would be accepted.

Workaround

I could change the type of Outer.a to Vector, but this not ideal as I want to require a specific eltype on that vector.

I could also define a method of convert like this Base.convert(::Type{Inner}, other) = Inner(other.x, 0), but I can't think of a way to provide a tight enough type bound on other so that this function won't get accidentally called.

MWE

Step 1: Serialize

using JLD2, FileIO

module Foo

struct Inner
    x::Int
end

struct Outer
    a::Vector{Inner}
end

end

save("test.jld2", Dict("a" => Foo.Outer([Foo.Inner(1)])))

Step 2: Deserialize with an Added Field

using JLD2, FileIO

module Foo

struct Inner
    x::Int
    y::Int
end

struct Outer
    a::Vector{Inner}
end

end

load("test.jld2")

Exception

julia> load("test.jld2")
┌ Warning: saved type Main.Foo.Inner is missing field y in workspace type; reconstructing
└ @ JLD2 ~/.julia/packages/JLD2/L21uR/src/data/reconstructing_datatypes.jl:152
Error encountered while load File{DataFormat{:JLD2}, String}("test.jld2").

Fatal error:
ERROR: MethodError: Cannot `convert` an object of type JLD2.ReconstructedTypes.var"##Main.Foo.Inner#257" to an object of type Main.Foo.Inner
Closest candidates are:
  convert(::Type{T}, ::T) where T at essentials.jl:205
  Main.Foo.Inner(::Any, ::Any) at REPL[2]:4
Stacktrace:
  [1] setindex!(A::Vector{Main.Foo.Inner}, x::JLD2.ReconstructedTypes.var"##Main.Foo.Inner#257", i1::Int64)
    @ Base ./array.jl:843
  [2] _unsafe_copyto!(dest::Vector{Main.Foo.Inner}, doffs::Int64, src::Vector{JLD2.ReconstructedTypes.var"##Main.Foo.Inner#257"}, soffs::Int64, n::Int64)
    @ Base ./array.jl:235
  [3] unsafe_copyto!
    @ ./array.jl:289 [inlined]
  [4] _copyto_impl!
    @ ./array.jl:313 [inlined]
  [5] copyto!
    @ ./array.jl:299 [inlined]
  [6] copyto!
    @ ./array.jl:325 [inlined]
  [7] copyto_axcheck!
    @ ./abstractarray.jl:1056 [inlined]
  [8] Vector{Main.Foo.Inner}(x::Vector{JLD2.ReconstructedTypes.var"##Main.Foo.Inner#257"})
    @ Base ./array.jl:540
  [9] convert(#unused#::Type{Vector{Main.Foo.Inner}}, a::Vector{JLD2.ReconstructedTypes.var"##Main.Foo.Inner#257"})
    @ Base ./array.jl:532
 [10] jlconvert
    @ ~/.julia/packages/JLD2/L21uR/src/data/writing_datatypes.jl:302 [inlined]
 [11] macro expansion
    @ ~/.julia/packages/JLD2/L21uR/src/data/reconstructing_datatypes.jl:563 [inlined]
 [12] jlconvert(#unused#::JLD2.ReadRepresentation{Main.Foo.Outer, JLD2.OnDiskRepresentation{(0,), Tuple{Vector{Main.Foo.Inner}}, Tuple{JLD2.RelOffset}}()}, f::JLD2.JLDFile{JLD2.MmapIO}, ptr::Ptr{Nothing}, header_offset::JLD2.RelOffset)
    @ JLD2 ~/.julia/packages/JLD2/L21uR/src/data/reconstructing_datatypes.jl:508
 [13] read_scalar(f::JLD2.JLDFile{JLD2.MmapIO}, rr::JLD2.ReadRepresentation{Main.Foo.Outer, JLD2.OnDiskRepresentation{(0,), Tuple{Vector{Main.Foo.Inner}}, Tuple{JLD2.RelOffset}}()}, header_offset::JLD2.RelOffset)
    @ JLD2 ~/.julia/packages/JLD2/L21uR/src/dataio.jl:37
 [14] read_data(f::JLD2.JLDFile{JLD2.MmapIO}, rr::Any, read_dataspace::Tuple{JLD2.ReadDataspace, JLD2.RelOffset, Int64, UInt16}, attributes::Vector{JLD2.ReadAttribute})
    @ JLD2 ~/.julia/packages/JLD2/L21uR/src/datasets.jl:170
 [15] read_data(f::JLD2.JLDFile{JLD2.MmapIO}, dataspace::JLD2.ReadDataspace, datatype_class::UInt8, datatype_offset::Int64, data_offset::Int64, data_length::Int64, filter_id::UInt16, header_offset::JLD2.RelOffset, attributes::Vector{JLD2.ReadAttribute})
    @ JLD2 ~/.julia/packages/JLD2/L21uR/src/datasets.jl:149
 [16] load_dataset(f::JLD2.JLDFile{JLD2.MmapIO}, offset::JLD2.RelOffset)
    @ JLD2 ~/.julia/packages/JLD2/L21uR/src/datasets.jl:92
 [17] getindex(g::JLD2.Group{JLD2.JLDFile{JLD2.MmapIO}}, name::String)
    @ JLD2 ~/.julia/packages/JLD2/L21uR/src/groups.jl:108
 [18] getindex(f::JLD2.JLDFile{JLD2.MmapIO}, name::String)
    @ JLD2 ~/.julia/packages/JLD2/L21uR/src/JLD2.jl:379
 [19] loadtodict!(d::Dict{String, Any}, g::JLD2.JLDFile{JLD2.MmapIO}, prefix::String)
    @ JLD2 ~/.julia/packages/JLD2/L21uR/src/loadsave.jl:154
 [20] loadtodict!
    @ ~/.julia/packages/JLD2/L21uR/src/loadsave.jl:153 [inlined]
 [21] #62
    @ ~/.julia/packages/JLD2/L21uR/src/fileio.jl:39 [inlined]
 [22] jldopen(::JLD2.var"#62#63", ::String, ::Vararg{String, N} where N; kws::Base.Iterators.Pairs{Union{}, Union{}, Tuple{}, NamedTuple{(), Tuple{}}})
    @ JLD2 ~/.julia/packages/JLD2/L21uR/src/loadsave.jl:4
 [23] jldopen
    @ ~/.julia/packages/JLD2/L21uR/src/loadsave.jl:2 [inlined]
 [24] #fileio_load#61
    @ ~/.julia/packages/JLD2/L21uR/src/fileio.jl:38 [inlined]
 [25] fileio_load(f::File{DataFormat{:JLD2}, String})
    @ JLD2 ~/.julia/packages/JLD2/L21uR/src/fileio.jl:38
 [26] #invokelatest#2
    @ ./essentials.jl:708 [inlined]
 [27] invokelatest
    @ ./essentials.jl:706 [inlined]
 [28] action(::Symbol, ::Vector{Union{Base.PkgId, Module}}, ::Formatted; options::Base.Iterators.Pairs{Union{}, Union{}, Tuple{}, NamedTuple{(), Tuple{}}})
    @ FileIO ~/.julia/packages/FileIO/QkYgA/src/loadsave.jl:219
 [29] action
    @ ~/.julia/packages/FileIO/QkYgA/src/loadsave.jl:197 [inlined]
 [30] action(::Symbol, ::Vector{Union{Base.PkgId, Module}}, ::Symbol, ::String; options::Base.Iterators.Pairs{Union{}, Union{}, Tuple{}, NamedTuple{(), Tuple{}}})
    @ FileIO ~/.julia/packages/FileIO/QkYgA/src/loadsave.jl:185
 [31] action
    @ ~/.julia/packages/FileIO/QkYgA/src/loadsave.jl:185 [inlined]
 [32] load(::String; options::Base.Iterators.Pairs{Union{}, Union{}, Tuple{}, NamedTuple{(), Tuple{}}})
    @ FileIO ~/.julia/packages/FileIO/QkYgA/src/loadsave.jl:113
 [33] load(::String)
    @ FileIO ~/.julia/packages/FileIO/QkYgA/src/loadsave.jl:110
 [34] top-level scope
    @ REPL[3]:1
Stacktrace:
 [1] handle_error(e::MethodError, q::Base.PkgId, bt::Vector{Union{Ptr{Nothing}, Base.InterpreterIP}})
   @ FileIO ~/.julia/packages/FileIO/QkYgA/src/error_handling.jl:61
 [2] handle_exceptions(exceptions::Vector{Tuple{Any, Union{Base.PkgId, Module}, Vector{T} where T}}, action::String)
   @ FileIO ~/.julia/packages/FileIO/QkYgA/src/error_handling.jl:56
 [3] action(::Symbol, ::Vector{Union{Base.PkgId, Module}}, ::Formatted; options::Base.Iterators.Pairs{Union{}, Union{}, Tuple{}, NamedTuple{(), Tuple{}}})
   @ FileIO ~/.julia/packages/FileIO/QkYgA/src/loadsave.jl:228
 [4] action
   @ ~/.julia/packages/FileIO/QkYgA/src/loadsave.jl:197 [inlined]
 [5] action(::Symbol, ::Vector{Union{Base.PkgId, Module}}, ::Symbol, ::String; options::Base.Iterators.Pairs{Union{}, Union{}, Tuple{}, NamedTuple{(), Tuple{}}})
   @ FileIO ~/.julia/packages/FileIO/QkYgA/src/loadsave.jl:185
 [6] action
   @ ~/.julia/packages/FileIO/QkYgA/src/loadsave.jl:185 [inlined]
 [7] load(::String; options::Base.Iterators.Pairs{Union{}, Union{}, Tuple{}, NamedTuple{(), Tuple{}}})
   @ FileIO ~/.julia/packages/FileIO/QkYgA/src/loadsave.jl:113
 [8] load(::String)
   @ FileIO ~/.julia/packages/FileIO/QkYgA/src/loadsave.jl:110
 [9] top-level scope
   @ REPL[3]:1
@Octogonapus Octogonapus changed the title Exception when reconstructing a type inside a collection Exception when reconstructing a type inside a collection with a specific type parameter Feb 3, 2022
@JonasIsensee
Copy link
Collaborator

Hi @Octogonapus ,
thanks for reporting this. I've labelled this as a bug.

Defining Base.convert(::Type{Foo.Inner}, x) = Foo.Inner(x.x, 0) works for me.
Another workaround is described in #242

@Octogonapus
Copy link
Author

I think this could be handled with a look-ahead inside constructrr, but that might be computationally expensive. The type of Outer.a would also need to be serialized properly. Right now, it's just Any instead of Vector{Inner}.

┌ Debug: constructrr
│   T = Main.Foo.Outer
│   dt = JLD2.CompoundDatatype(0x00000008, [:a], [0], JLD2.H5Datatype[JLD2.BasicDatatype(0x37, 0x00, 0x00, 0x00, 0x00000008)])
│   attrs =1-element Vector{JLD2.ReadAttribute}:
│     JLD2.ReadAttribute(:julia_type, JLD2.ReadDataspace(0x00, 0x00, 4891), 0xff, 560, 4891)
└ @ JLD2 ~/Documents/code/JLD2.jl/src/data/reconstructing_datatypes.jl:121
┌ Debug: field_datatypes
│   field_datatypes = JLD2.RelOffset[]
└ @ JLD2 ~/Documents/code/JLD2.jl/src/data/reconstructing_datatypes.jl:123
┌ Debug: isconcretetype(T)
│   isconcretetype(T) = true
└ @ JLD2 ~/Documents/code/JLD2.jl/src/data/reconstructing_datatypes.jl:126
┌ Debug: dtnames & mapped
│   dtnames =
│    Dict{Symbol, Int64} with 1 entry::a => 1
│   mapped =1-element BitVector:0
└ @ JLD2 ~/Documents/code/JLD2.jl/src/data/reconstructing_datatypes.jl:138
┌ Debug: determine dtrr
│   field_datatypes = JLD2.RelOffset[]
│   dtindex = 1
│   dt.members =1-element Vector{JLD2.H5Datatype}:
│     JLD2.BasicDatatype(0x37, 0x00, 0x00, 0x00, 0x00000008)
└ @ JLD2 ~/Documents/code/JLD2.jl/src/data/reconstructing_datatypes.jl:161
┌ Debug: dtrr
│   dtrr = JLD2.ReadRepresentation{Any, JLD2.RelOffset}()
│   typeof(dtrr) = JLD2.ReadRepresentation{Any, JLD2.RelOffset}
│   (typeof(dtrr)).parameters = svec(Any, JLD2.RelOffset)
└ @ JLD2 ~/Documents/code/JLD2.jl/src/data/reconstructing_datatypes.jl:168
┌ Debug: constructrr type intersection
│   readtype = Any
│   wstype = Vector{Main.Foo.Inner} (alias for Array{Main.Foo.Inner, 1})
│   typeintersect(readtype, wstype) = Vector{Main.Foo.Inner} (alias for Array{Main.Foo.Inner, 1})
│   hasmethod(convert, Tuple{Type{wstype}, readtype}) = false
└ @ JLD2 ~/Documents/code/JLD2.jl/src/data/reconstructing_datatypes.jl:171

If readtype was Vector{Inner} instead of Any, and we knew if reconstructing Inner was required, then we would know to reconstruct Outer. I'm not too familiar with this package's internals, can you think of a way to do that? I can work on implementing it.

I may also work on implementing a "reconstructed types only" feature, as described in #242.

@JonasIsensee
Copy link
Collaborator

If readtype was Vector{Inner} instead of Any, and we knew if reconstructing Inner was required, then we would know to reconstruct Outer. I'm not too familiar with this package's internals, can you think of a way to do that? I can work on implementing it.

I've looked at the source code and the files. Sadly I'm pretty sure this is not possible.
JLD2 does not store the julia-eltype of structs. It only stores the HDF5 types of fields.
It's easy to map this in the simple cases of e.g. floats and ints but arrays are mutable objects
and are therefore stored separately and referenced by a RelOffset.

constructrr cannot know the field type of Outer.a - not without a serious reworking of JLD2 internals. (See #316 and #338 for my unfinished attempts)

I believe it may be more useful to implement the explicit type remapping that's already working in #316.
If you're interested in contributing, have a look at #316 and what I did with the type remapping.
Essentially, every file gets a dictionary that maps datatype names to datatypes in the running session.
This already worked nicely but #316 tried to do anonymous functions as well, which I didn't finish.

@JonasIsensee
Copy link
Collaborator

Since I had the code lying around already I ended up doing the above myself in #376 .
I hope this helps

Sign up for free to join this conversation on GitHub. Already have an account? Sign in to comment
Projects
None yet
Development

No branches or pull requests

2 participants