High level problem case: Files with A LOT OF VARIABLES #736

Open
jbusecke opened this issue Apr 17, 2024 · 0 comments


I have been working on refactoring the community bakery at LEAP (#735) and have one interesting problem case here: https://github.com/leap-stc/wavewatch3_feedstock (see in particular the code in leap-stc/wavewatch3_feedstock#1).

This dataset is different from many others in at least two ways AFAICT now:

  • The files are extremely heavily compressed (3GB file, 17GB in memory)
  • A TON of variables!

Together these blow up memory usage. I tested running the recipe while dropping every variable but one, and it works (it still consumes a lot of memory, but it succeeds).

I think the root of the problem is that a fragment with a ~100MB chunk size on a single variable is still extremely large in total (~2-3GB), so when the workers try to load a bunch of fragments eagerly, they blow up.
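To make the arithmetic concrete, here is a rough back-of-the-envelope sketch. It assumes a fragment holds one chunk of every variable, so its size scales linearly with the variable count. The variable count (25) and the number of fragments a worker holds at once (8) are illustrative guesses, not numbers from the actual dataset; only the ~100MB chunk size and the ~2-3GB fragment size come from the description above.

```python
MB = 10**6
GB = 10**9


def fragment_bytes(n_variables: int, chunk_bytes_per_variable: int) -> int:
    """Approximate size of one fragment holding one chunk of every variable."""
    return n_variables * chunk_bytes_per_variable


def peak_worker_bytes(
    n_variables: int, chunk_bytes_per_variable: int, fragments_in_flight: int
) -> int:
    """Approximate memory if a worker eagerly holds several fragments at once."""
    return fragment_bytes(n_variables, chunk_bytes_per_variable) * fragments_in_flight


chunk = 100 * MB   # ~100MB per variable, as quoted above
n_variables = 25   # ASSUMPTION: illustrative, chosen to land in the ~2-3GB range

all_vars = fragment_bytes(n_variables, chunk)
one_var = fragment_bytes(1, chunk)

print(all_vars / GB)                                   # 2.5
print(one_var / MB)                                    # 100.0
print(peak_worker_bytes(n_variables, chunk, 8) / GB)   # 20.0 (8 is a guess)
```

Under these assumptions, dropping to a single variable shrinks each fragment by the variable count, which would be consistent with the single-variable run succeeding.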

I tried just throwing more RAM at the problem (800GB RAM was not enough!!!), but this dataset is very large in total and I think eventually I would have to be able to load the whole thing into memory, which really is not the point of doing this.

My current suspicion is that for cases like this we might want to consider not only splitting fragments by dimension indices, but also splitting across variables. I am not at all sure how to achieve this, but wanted to record this as an interesting failure case.
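As a purely conceptual sketch of what "splitting across variables" could mean, the snippet below enumerates fragment keys as (variable, time index) pairs instead of one key per time index. This is not pangeo-forge-recipes API: `FragmentKey`, `split_by_dimension`, and `split_by_dimension_and_variable` are hypothetical names, and the variable names are placeholders.

```python
from itertools import product
from typing import List, NamedTuple, Optional, Sequence


class FragmentKey(NamedTuple):
    """Hypothetical identifier for one unit of work."""
    time_index: int
    variable: Optional[str]  # None means "all variables in one fragment"


def split_by_dimension(n_time_chunks: int) -> List[FragmentKey]:
    """One fragment per time chunk, containing every variable."""
    return [FragmentKey(t, None) for t in range(n_time_chunks)]


def split_by_dimension_and_variable(
    variables: Sequence[str], n_time_chunks: int
) -> List[FragmentKey]:
    """One fragment per (time chunk, variable) pair."""
    return [FragmentKey(t, v) for t, v in product(range(n_time_chunks), variables)]


variables = ["var_a", "var_b", "var_c"]  # placeholder names
coarse = split_by_dimension(n_time_chunks=2)
fine = split_by_dimension_and_variable(variables, n_time_chunks=2)

print(len(coarse))  # 2 fragments, each holding all 3 variables
print(len(fine))    # 6 fragments, each holding a single variable
print(fine[0])      # FragmentKey(time_index=0, variable='var_a')
```

The trade-off in this sketch is more, smaller fragments: each one carries roughly 1/N of the data for N variables, at the cost of N times as many work items and some way of recombining variables at write time, which is the open question.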
