Reading in all 332 basins, both from the GCP VM `SnowpackStatisticsByBasin/` CSV files and from the locally created 332 basins, is very slow. I have tried a few different test setups to debug:
Since the GCP files contain about 17 years' worth of daily data, I took out some basins and slowly put them back into the `SnowpackStatisticsByBasin/` folder. The results:
- 100 basins: 57 seconds
- 150 basins: 1 minute 37 seconds
- 240 basins: 3 minutes 31 seconds
- 294 basins: 5 minutes 59 seconds
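The growth in those timings is faster than linear; a quick way to see this is to compute seconds per basin from the numbers reported above (a small sketch, using only the figures in this issue):

```python
# Observed wall-clock times for reading N basins (from the tests above).
timings = {100: 57, 150: 97, 240: 211, 294: 359}  # basins -> total seconds

# If the cost were linear, seconds-per-basin would stay flat; instead it climbs.
for basins, seconds in sorted(timings.items()):
    print(f"{basins} basins: {seconds / basins:.2f} s/basin")
# -> 0.57, 0.65, 0.88, 1.22 s/basin
```

The steadily rising per-basin cost suggests something superlinear (e.g., per-basin work that scans data already read), which is worth keeping in mind when debugging.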
From what I saw, memory never went above 2.75 GB. The command file is quite small at this point, so there's really not much going on.
The performance does not seem unreasonable. Although ideally runs should be as fast as possible, processing a lot of data can take time. 332 basins * 3 time series = 996 time series. With 365.25 points/year over 17 years, that is about 6,209 data points per time series, 18,628 data points per basin, and roughly 6,184,000 data points total. At 8 bytes per double, that is only about 49.5 MB just for time series values. Other memory is used as well, but the point is that memory should not be the problem.
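The arithmetic above can be checked directly (a sketch of the same estimate, assuming one 8-byte double per data value):

```python
basins = 332
series_per_basin = 3
points_per_year = 365.25
years = 17

points_per_series = points_per_year * years                    # ~6,209
points_total = basins * series_per_basin * points_per_series   # ~6.18 million
bytes_total = points_total * 8                                 # 8-byte double per value

print(f"{points_total:,.0f} points, {bytes_total / 1e6:.1f} MB")
# -> 6,184,413 points, 49.5 MB
```

Even with per-point overhead several times that (object headers, date/time objects, etc.), the total stays far below the 2.75 GB observed, so memory pressure is unlikely to explain the slowdown.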
The increase in run time is not linear, but it is not wildly exponential either. Maybe it is what it is.
There is a possibility that the VM, or the combination of Linux and Windows, is slow in other ways such as I/O. There may also be some unintended inefficiencies in the command file that can be identified with review. The output may be getting buffered in some odd way, but usually the UI shows steady progress unless one command really is slow.
I suggest working out the details of the time series comparison using fewer stations (even 1 station) and then running the big comparison. I usually add commands to make it easy to switch between the short and long runs.
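A switch between short and long runs can be sketched in the command file itself. This is a hypothetical outline using TSTool's `SetProperty` and `If`/`EndIf` commands; the exact parameter names and condition syntax should be checked against the TSTool documentation:

```
# Hypothetical sketch: toggle RunMode between "short" and "long" in one place.
SetProperty(PropertyName="RunMode",PropertyType=String,PropertyValue="short")
If(Name="IfShort",Condition="${RunMode} == short")
# ... read 1 basin only, for fast iteration ...
EndIf(Name="IfShort")
If(Name="IfLong",Condition="${RunMode} == long")
# ... read all 332 basins for the full comparison ...
EndIf(Name="IfLong")
```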
Also, the ProfileCommands command under the Running and Properties menu will track command performance, but maybe I need to add a general performance check.