Add Zstd compression #560
Codecov Report
All modified and coverable lines are covered by tests ✅

Coverage diff, master vs. #560:
            master    #560     +/-
Coverage    86.94%    86.92%   -0.02%
Files       31        31
Lines       4311      4321     +10
Hits        3748      3756     +8
Misses      563       565      +2
|
I'm not an expert on this package, but you seem to be correct. Line 160 in ccd60eb
|
Manual test:
julia> using JLD2, CodecZstd
julia> A = zeros(1000,1000);
julia> A[1] = rand()
0.44335952762563824
julia> sizeof(A)/1000^2 # 8 MB array
8.0
julia> save("test_without_compression.jld2", "A", A)
julia> save("test_with_compression.jld2", "A", A, compress=ZstdCompressor())
julia> A == load("test_with_compression.jld2", "A")
true
julia> A == load("test_without_compression.jld2", "A")
true
I then have two files on disk: one uncompressed, about 8 MB, and one compressed to about 1 KB. |
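The manual test above can be wrapped into a repeatable check. This is only a sketch: it assumes a JLD2 build that includes this PR (so that `compress=ZstdCompressor()` is accepted), and `compare_sizes` is a hypothetical helper name, not part of JLD2.

```julia
using JLD2, CodecZstd

# Hypothetical helper: save `A` with and without Zstd compression,
# verify both round trips, and return the two file sizes in bytes.
function compare_sizes(A)
    mktempdir() do dir
        plain = joinpath(dir, "plain.jld2")
        packed = joinpath(dir, "zstd.jld2")
        save(plain, "A", A)
        # Assumption: this keyword is supported as proposed in this PR.
        save(packed, "A", A; compress = ZstdCompressor())
        @assert load(plain, "A") == A
        @assert load(packed, "A") == A
        (plain_bytes = filesize(plain), zstd_bytes = filesize(packed))
    end
end

A = zeros(1000, 1000)
A[1] = rand()
sizes = compare_sizes(A)
println("uncompressed: ", sizes.plain_bytes, " bytes")
println("compressed:   ", sizes.zstd_bytes, " bytes")
```

Note that this only checks JLD2 against itself; it says nothing about whether other HDF5 readers can decode the compressed file.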
You should test if you can load the file via HDF5.jl.
|
Okay, that's not working correctly:
julia> using HDF5
julia> h5open("test_with_compression.jld2") do h5f
h5f["A"][]
end
1000×1000 Matrix{Float64}:
0.0 0.0 … 0.0 0.0 0.0 0.0 0.0 0.0 0.0
0.0 0.0 0.0 0.0 0.0 0.0 0.0 0.0 0.0
7.41892e-310 0.0 0.0 0.0 0.0 0.0 0.0 0.0 0.0
1.95e-321 0.0 0.0 0.0 0.0 0.0 0.0 0.0 0.0
7.41892e-310 0.0 0.0 0.0 0.0 0.0 0.0 0.0 0.0
1.42e-321 0.0 … 0.0 0.0 0.0 0.0 0.0 0.0 0.0
NaN 0.0 0.0 0.0 0.0 0.0 0.0 0.0 0.0
NaN 0.0 0.0 0.0 0.0 0.0 0.0 0.0 0.0
7.41892e-310 0.0 0.0 0.0 0.0 0.0 0.0 0.0 0.0
3.6e-322 0.0 0.0 0.0 0.0 0.0 0.0 0.0 0.0
⋮ ⋱ ⋮
0.0 4.0e-323 … 0.0 0.0 0.0 0.0 0.0 0.0 0.0
0.0 0.0 0.0 0.0 0.0 0.0 0.0 0.0 0.0
0.0 0.0 0.0 0.0 0.0 0.0 0.0 0.0 0.0
0.0 0.0 0.0 0.0 0.0 0.0 0.0 0.0 0.0
0.0 5.17085e170 0.0 0.0 0.0 0.0 0.0 0.0 0.0
0.0 9.56977e-315 … 0.0 0.0 0.0 0.0 0.0 0.0 0.0
0.0 0.0 0.0 0.0 0.0 0.0 0.0 0.0 0.0
0.0 0.0 0.0 0.0 0.0 0.0 0.0 0.0 0.0
0.0 1.11254e-308 0.0 0.0 0.0 0.0 0.0 0.0 0.0
0.0 1.11255e-308 0.0 0.0 0.0 0.0 0.0 0.0 0.0
(It should be A[1] == 0.4434 and zero everywhere else.) The file without zstd compression is loaded correctly, though. Why is that? edit: The same works perfectly fine with |
So the problem is that we are using https://github.com/HDFGroup/hdf5_plugins/blob/master/ZSTD/src/H5Zzstd.c |
OK, I'm going to start working on a ZstdFrameCompressor / ZstdFrameDecompressor. I'm not sure that is the final name yet, but it's basically the NotStream compressor / decompressor. I'm not completely sure how to make that work in the TranscodingStreams API, but we'll see. |
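Some background on the frame vs. stream distinction may help here. The comment above is truncated, so the following is an assumption rather than the diagnosis given in this thread: a one-shot decoder usually wants the decompressed size up front, and a Zstandard frame written in streaming mode may not record it. The sketch below uses only Base Julia and the frame header layout from RFC 8878; `has_content_size` is a hypothetical helper, not a CodecZstd or JLD2 function.

```julia
# Zstandard frame magic number 0xFD2FB528, stored little-endian (RFC 8878).
const ZSTD_MAGIC = UInt8[0x28, 0xB5, 0x2F, 0xFD]

# Hypothetical helper: return `true` if the frame header in `frame` carries a
# Frame_Content_Size field, i.e. the decompressed size is known up front.
# Assumes `frame` is 1-indexed and starts at the beginning of a Zstandard frame.
function has_content_size(frame::AbstractVector{UInt8})
    length(frame) >= 5 || throw(ArgumentError("too short to be a Zstandard frame"))
    frame[1:4] == ZSTD_MAGIC || throw(ArgumentError("missing Zstandard magic number"))
    descriptor = frame[5]                       # Frame_Header_Descriptor byte
    fcs_flag = descriptor >> 6                  # bits 7-6: Frame_Content_Size_flag
    single_segment = (descriptor >> 5) & 0x01   # bit 5: Single_Segment_flag
    # The size field is absent only when both the flag and Single_Segment are zero.
    return fcs_flag != 0 || single_segment == 1
end

# Descriptor 0x00: no content size recorded.
no_size = has_content_size(UInt8[0x28, 0xB5, 0x2F, 0xFD, 0x00])
# Descriptor 0x20: Single_Segment set, so a one-byte size field follows.
with_size = has_content_size(UInt8[0x28, 0xB5, 0x2F, 0xFD, 0x20])
```

Running this helper on the first bytes of a compressed chunk would show whether a given compressor wrote the size into the header, which is one way to compare a streaming compressor with a frame-oriented one.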
Awesome, let me know if I can help! |
@mkitti I've already written the tests for |
I've set the compressor here from |
I think it should still be |
fixes #357
@mkitti I'm not sure what the 4-element tuple in ID_TO_COMPRESSOR is supposed to be. Package name, compressor name, decompressor name, and some short name in caps?