Bitrounding + Lossless compression #3599

Draft: wants to merge 3 commits into main
milankl (Collaborator) commented May 13, 2024

TL;DR: We can compress 18GB of Oceananigans simulation checkpoints into 350MB with bitrounding and lossless compression.

Problem

Output is currently written uncompressed in Float64, which contains

  • redundancies: zeros for immersed boundaries, halos; similar/identical exponent bits
  • false information: trailing mantissa bits with no mutual information to neighbouring grid points

Proposed solution

Bitrounding to remove false information (those bits are replaced with zero bits, which turns them into redundancies), then lossless compression to remove the redundancies.
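As a minimal sketch of the bitrounding step, assuming round-to-nearest with ties to even on the Float64 mantissa: the function name `bitround` is hypothetical and uses only Base Julia, it is not the BitInformation.jl implementation used in this PR.

```julia
# Hypothetical helper: round a Float64 to `keepbits` explicit mantissa bits
# (round-to-nearest, ties to even) and set all discarded trailing bits to zero.
function bitround(x::Float64, keepbits::Integer)
    0 <= keepbits < 52 || throw(ArgumentError("keepbits must be in 0:51"))
    isfinite(x) || return x                          # leave NaN and Inf untouched
    shift = 52 - keepbits                            # number of trailing bits to discard
    ui = reinterpret(UInt64, x)
    half = (UInt64(1) << (shift - 1)) - UInt64(1)    # just below half a unit of the last kept bit
    lsb = (ui >> shift) & UInt64(1)                  # last kept bit, breaks ties towards even
    ui += half + lsb                                 # a carry rounds up, possibly into the exponent
    mask = ~((UInt64(1) << shift) - UInt64(1))       # ones for sign, exponent and kept mantissa bits
    return reinterpret(Float64, ui & mask)
end

# Broadcasting applies it elementwise; 7 keepbits is the value found for temperature.
T = [3.141592653589793, 0.1, 17.25]
T_rounded = bitround.(T, 7)

# The discarded bits are now zeros, which a lossless compressor can remove.
zero_bits = trailing_zeros.(reinterpret.(UInt64, T_rounded))
```

With `keepbits` mantissa bits retained, the relative rounding error of a finite value is at most `2^-(keepbits + 1)`.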

I've looked into the bitwise real information content for a single checkpoint in Simone's OMIP simulations and got this, with the orange line denoting 99.9% of the real information

[image: bitwise real information content of the checkpoint variables, with the 99.9% line in orange]

So

  • u, v have 0-2 mantissa bits of information (=keepbits) with more information in the surface layer (k=60)
  • w has 0 keepbits (exponent bits though!)
  • temperature T (in ˚C) has 7 keepbits (that's 3-4 digits) relatively independent of depth
  • salinity S has 12 keepbits at the surface, increasing to 16 in the deep ocean
  • sea surface height $\eta$ is at 6 keepbits
  • tendencies generally have fewer keepbits, but maybe they shouldn't be stored anyway (restart with a single Euler forward step instead)

The checkpoint file Simone provided had

  • 18GB total file size, single time step
  • including 7 halo points in all directions
  • 400MB of which is the grid
  • u,v,w,T,S,$\eta$ variables and 2x tendencies (AB2) for all but w, all in Float64

Compression options

The 18GB can be compressed into

  • Only lossless: 6.9GB (2.6x), removes redundancies from halo and immersed boundaries
  • Only Float32: 9GB (2x), removes only some false information in trailing bits
  • Float32 then lossless: 3.25GB (5.5x)
  • Bitrounded then lossless: 1GB (18x)
  • Bitrounded, zero tendencies, then lossless: 350MB (51x). With lossy compression, saving the tendencies eventually becomes pointless, as restarting with a single Euler forward step might just do the job anyway
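The compression factors in this list follow from dividing the 18GB original by each compressed size. A small sketch to reproduce them, where the variable and function names are illustrative and the sizes are the ones quoted above:

```julia
original_GB = 18.0

# Compressed sizes in GB, as quoted in the list above
sizes_GB = [
    "only lossless"                               => 6.9,
    "only Float32"                                => 9.0,
    "Float32 then lossless"                       => 3.25,
    "bitrounded then lossless"                    => 1.0,
    "bitrounded, zero tendencies, then lossless"  => 0.35,
]

compression_factor(original, compressed) = original / compressed

for (option, size) in sizes_GB
    factor = compression_factor(original_GB, size)
    println(rpad(option, 45), round(factor, digits=1), "x")
end
```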

This currently uses Zstd (https://github.com/facebook/zstd), a modern yet already widely available lossless compressor, through its command-line interface zstd. In JLD2, compress=true currently uses ZlibCompressor from https://github.com/JuliaIO/CodecZlib.jl, which compresses similarly well but is 2-3x slower. I'm working on getting CodecZstd supported in JLD2: JuliaIO/JLD2.jl#560

While this PR is still a draft, I'm proposing the new defaults

  • lossless compression with compress=true for JLD2, deflatelevel=3 for netCDF
  • bitrounding to keepbits ~20 (single precision-ish) whether you output in Float32/64 (doesn't matter when lossless compression is on)
  • a default bitrounder that rounds to the keepbits as suggested above that can be used instead of bitrounder=nothing (default)

We can then independently tweak the precision (how many keepbits, ideally as a function of the vertical, see salinity) and the lossless compressor (Zlib -> Zstandard).

milankl (Collaborator, Author) commented May 13, 2024

Just added BitInformation to the Project.toml; because of its dependency on StatsBase and Distributions, this also adds

    Updating `~/git/Oceananigans.jl/Project.toml`
  [de688a37] + BitInformation v0.6.1
    Updating `~/git/Oceananigans.jl/Manifest.toml`
  [66dad0bd] + AliasTables v1.1.2
  [de688a37] + BitInformation v0.6.1
  [49dc2e85] + Calculus v0.5.1
  [31c24e10] + Distributions v0.25.108
  [fa6b7ba4] + DualNumbers v0.6.8
  [1a297f60] + FillArrays v1.11.0
  [34004b35] + HypergeometricFunctions v0.3.23
  [77ba4419] + NaNMath v1.0.2
  [90014a1f] + PDMats v0.11.31
  [1fd47b50] + QuadGK v2.9.4
  [79098fc4] + Rmath v0.7.1
  [2913bbd2] + StatsBase v0.34.3
  [4c63d2b9] + StatsFuns v1.3.1
  [f50d1b31] + Rmath_jll v0.4.0+0

also why is the Manifest.toml committed?

glwagner (Member) commented

> also why is the Manifest.toml committed?

Through past experience we found that we needed the Manifest committed to make sense of the errors we encounter during CI.

@@ -326,7 +320,7 @@ simulation = Simulation(model, Δt=1.25, stop_iteration=3)
 f(model) = model.clock.time^2; # scalar output
-g(model) = model.clock.time .* exp.(znodes(grid, Center())) # vector/profile output
+g(model) = model.clock.time .* exp.(znodes(Center, grid)) # vector/profile output
Member

You may want to merge main because I think we need this change for the doctest to pass

Collaborator Author

I'm two commits ahead of main and none behind (main...mk/compression). I haven't actively changed these, but maybe @simone-silvestri and I started off from an outdated branch?

Member

yes not sure, but this change does walk back a recent PR

Collaborator Author

shoot then I might have created a non-consistent history, sorry, I'll try to resolve that.

Comment on lines +12 to +13
default_bit_rounding(::Val{:T}) = 7
default_bit_rounding(::Val{:S}) = 16 # 12 at the surface, 16 deep ocean
Member

This is interesting. Why is there a difference between T, S? Is this specific to the simulation that this was tested on, or can we be sure this is valid for all simulations, past climates, future climates, idealized simulations at other resolutions, etc?

It seems we need to have default bit rounding for passive tracers.

Collaborator Author

Although relatively robust through time and space, this depends on a lot of things, including whether your unit carries some offset around (e.g. Kelvin vs ˚C, density vs density anomaly). So it's tricky to generalise. I suggest having some reasonable defaults if someone uses bit rounding (default nothing or single precision as you like), but highlighting that these should be checked, similar to how I did it here with the bitinformation analysis above.

For global ocean simulations I expect these to be reasonable defaults. I believe for now this is mostly to reduce the filesizes for OMIP simulations

Member

True, I'm just not sure that OMIP is going to be the most common use case, so there's a question about what default is appropriate here

Member

The OMIP defaults might belong in the ClimaOcean setup, perhaps

Collaborator Author

We could set the defaults here as 23 mantissa bits (=Float32 precision, whether you use Float32 or 64) and then lower in ClimaOcean?
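One way this split could look, as a rough sketch only: a generic fallback of 23 keepbits plus a set of lower, simulation-specific overrides selected by dispatch. The types `GenericDefaults` and `OMIPDefaults` and the function `default_keepbits` are hypothetical names for illustration, not part of this PR; the T and S values are the ones from the diff above.

```julia
# Hypothetical marker types that select a set of defaults via dispatch
struct GenericDefaults end   # what could live in Oceananigans
struct OMIPDefaults end      # what could live in ClimaOcean

# Generic fallback: 23 mantissa bits (Float32 precision) for every variable
default_keepbits(::GenericDefaults, ::Val) = 23

# OMIP-specific values for T and S, everything else falls back to the generic default
default_keepbits(::OMIPDefaults, ::Val{:T}) = 7
default_keepbits(::OMIPDefaults, ::Val{:S}) = 16   # 12 at the surface, 16 deep ocean
default_keepbits(::OMIPDefaults, name::Val) = default_keepbits(GenericDefaults(), name)

# Convenience method so that callers can pass a plain Symbol
default_keepbits(defaults, name::Symbol) = default_keepbits(defaults, Val(name))

# User-provided keyword arguments take precedence over the defaults
function keepbits_for(defaults, names; user_rounding...)
    return Dict(name => get(user_rounding, name, default_keepbits(defaults, name))
                for name in names)
end

keepbits = keepbits_for(OMIPDefaults(), (:T, :S, :c); S = 12)
```

Here `keepbits[:T]` takes the OMIP value, `keepbits[:S]` the user override, and the passive tracer `:c` the generic 23-bit fallback.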

Comment on lines +16 to +17
function BitRounding(outputs = nothing;
user_rounding...)
Member

For the purpose of figuring out good defaults perhaps we should include model as an input here?

Member

Then default_bit_rounding can take model as an argument, and dispatch on various things, for example the equation of state (which should know the units of temperature), and perhaps the biogeochemistry model, which may know the units of some important tracers

Co-authored-by: Gregory L. Wagner <wagner.greg@gmail.com>
2 participants