speed of cfa #736
Thanks, Jonathan. I shall investigate ...
However, at only 4 minutes to aggregate on the fly ...
Here are my aggregate/write times:
In [19]: %time f = cf.read('*.pp')
CPU times: user 3min 45s, sys: 2.36 s, total: 3min 47s
Wall time: 3min 54s
In [20]: len(f)
4069
In [21]: %time cf.write(f, 'delme.nca', cfa=True)
CPU times: user 1h 37min 15s, sys: 56.8 s, total: 1h 38min 12s
Wall time: 1h 39min 22s
In [22]: !du -sh delme.nca
25M delme.nca
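For anyone wanting to reproduce this read/write comparison outside IPython, here is a minimal sketch of a per-stage wall-clock timer using only the standard library. The names `timed` and `run_stages` are hypothetical helpers, not part of cf-python, and the stages in the demo are stand-ins for the real calls.

```python
import time
from contextlib import contextmanager


@contextmanager
def timed(label, results):
    """Store the wall-clock duration of the enclosed block in results[label]."""
    start = time.perf_counter()
    try:
        yield
    finally:
        results[label] = time.perf_counter() - start


def run_stages(stages):
    """Run each (label, callable) pair in order and return {label: seconds}."""
    results = {}
    for label, func in stages:
        with timed(label, results):
            func()
    return results


if __name__ == "__main__":
    # Stand-in workloads (assumption): with cf-python installed, these would
    # be replaced by the cf.read(...) and cf.write(..., cfa=True) calls shown
    # in the session above.
    timings = run_stages([
        ("read", lambda: sum(range(100_000))),
        ("write", lambda: sorted(range(100_000), reverse=True)),
    ])
    for label, seconds in timings.items():
        print(f"{label}: {seconds:.3f} s")
```

This reports wall time only; the user/sys CPU split printed by `%time` is not captured here.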
Dear @davidhassell
Thanks for the tests. Four minutes is quick for aggregation. That is an impressive speedup, indeed. However, it's too long to wait for accessing a dataset when doing interactive analysis. If you could speed it up by another factor of 100, it would be fine. 😄 My test on ... To make the ...
Best wishes
Jonathan
Some more information. I can ... We've just discussed this. Your experiment shows that it's not aggregating the specific humidity fields. That would explain why there are so many output fields. It does aggregate all the others, you say, yet it still takes 30 minutes to write the 67 (I suppose) aggregated fields to the CFA file, without data.
Part of this is addressed by #737 (ensuring we write 71 fields as intended, rather than 4069!), but that is not the whole story. Tests are ongoing, and I'll write up the answer soon.
Dear @davidhassell and @sadielbartholomew
A few months ago I recall David reporting much faster time for `cfa` processing `pp` files. I've installed the latest version of cf-python and dependencies, I believe:
In `/storage/basic/baobab/jonathan/general/exprzb.000100` on the RACC I am executing `cfa -f CFA4 -o nca *.pp`. The directory contains 42,000 `pp` files, each containing one `pp` field. So far, it has been executing for a couple of hours. Should it take this long?
Best wishes and thanks
Jonathan