Skip to content
New issue

Have a question about this project? Sign up for a free GitHub account to open an issue and contact its maintainers and the community.

By clicking “Sign up for GitHub”, you agree to our terms of service and privacy statement. We’ll occasionally send you account related emails.

Already on GitHub? Sign in to your account

how should "fill_value": null be read? #97

Open
gdkrmr opened this issue Oct 19, 2022 · 3 comments
Open

how should "fill_value": null be read? #97

gdkrmr opened this issue Oct 19, 2022 · 3 comments

Comments

@gdkrmr
Copy link
Contributor

gdkrmr commented Oct 19, 2022

We have a zarray with the following .zarray:

{"dtype":"<f4","filters":null,"shape":[15706,720,1440],"order":"C","zarr_format":2,"chunks":[5844,60,60],"fill_value":null,"compressor":{"blocksize":0,"id":"blosc","clevel":5,"cname":"lz4","shuffle":1}}

when reading data from the directory I get the following error

julia> data = zopen(path)
julia> data[1, 1, 1]
ERROR: MethodError: Cannot `convert` an object of type Nothing to an object of type Float32
Closest candidates are:
  convert(::Type{T}, ::Base.TwicePrecision) where T<:Number at twiceprecision.jl:273
  convert(::Type{T}, ::AbstractChar) where T<:Number at char.jl:185
  convert(::Type{T}, ::CartesianIndex{1}) where T<:Number at multidimensional.jl:130
  ...
Stacktrace:
  [1] fill!(dest::Array{Float32, 3}, x::Nothing)
    @ Base ./array.jl:351
  [2] readchunk!(a::Array{Float32, 3}, z::ZArray{Float32, 3, Zarr.BloscCompressor, DirectoryStore}, i::CartesianIndex{3})
    @ Zarr ~/.julia/packages/Zarr/tmr2s/src/ZArray.jl:188
  [3] (::Zarr.var"#66#69"{Bool, OffsetArrays.OffsetArray{Float32, 3, Array{Float32, 3}}, ZArray{Float32, 3, Zarr.BloscCompressor, DirectoryStore}, CartesianIndices{3, Tuple{UnitRange{Int64}, UnitRange{Int64}, UnitRange{Int64}}}, Array{Float32, 3}})(bI::CartesianIndex{3})
    @ Zarr ~/.julia/packages/Zarr/tmr2s/src/ZArray.jl:161
  [4] foreach
    @ ./abstractarray.jl:2774 [inlined]
  [5] readblock!(aout::Array{Float32, 3}, z::ZArray{Float32, 3, Zarr.BloscCompressor, DirectoryStore}, r::CartesianIndices{3, Tuple{UnitRange{Int64}, UnitRange{Int64}, UnitRange{Int64}}}; readmode::Bool)
    @ Zarr ~/.julia/packages/Zarr/tmr2s/src/ZArray.jl:151
  [6] readblock!
    @ ~/.julia/packages/Zarr/tmr2s/src/ZArray.jl:143 [inlined]
  [7] readblock!
    @ ~/.julia/packages/Zarr/tmr2s/src/ZArray.jl:174 [inlined]
  [8] getindex_disk
    @ ~/.julia/packages/DiskArrays/VGEpq/src/diskarray.jl:31 [inlined]
  [9] getindex(::ZArray{Float32, 3, Zarr.BloscCompressor, DirectoryStore}, ::Int64, ::Int64, ::Int64)
    @ DiskArrays ~/.julia/packages/DiskArrays/VGEpq/src/diskarray.jl:177
 [10] top-level scope
    @ REPL[30]:1

I think the error comes from fill_value:null. The data can be read with xarray. Is this error correct behavior?

@MartinuzziFrancesco

@meggart
Copy link
Collaborator

meggart commented Oct 20, 2022

The question is what you would expect? Yes, xarray reads the data but it is returning garbage for non-existing chunks. The problem is that a zarr array with fill_value null which has missing chunks is per se not well-defined. This was a long discussion in the python zarr impl and the conclusion was that fill_value=null will be forbidden in zarr v3.

I think that throwing an error is better than returning garbage data, but of course the error message should be improved.

@MartinuzziFrancesco
Copy link

A suggestion to deal with this could probably be to add a warning message that there are some missing values and fill them with nothings, dispatching over the base convert function. At the moment even for a single missing value for a full timeseries it's not possible to read it because of this error. This could also allow the user to actually count the missing values

@gdkrmr
Copy link
Contributor Author

gdkrmr commented Oct 20, 2022

I think that throwing an error is better than returning garbage data

I agree with you here. Throwing an error is correct. We could also return an Array{Union{Nothing, Float32}}, this should behave similarly to Array{Union{Missing, Float32}}, right?

Sign up for free to join this conversation on GitHub. Already have an account? Sign in to comment
Labels
None yet
Projects
None yet
Development

No branches or pull requests

3 participants