Some operations are memory hogs when operating on large results. For example, diff-reduce starts with one pandas DataFrame holding all of the results, and gradually builds up another DataFrame with a subset of those results. When the CSV file for the full results is several gigabytes in size, this ends up using a lot of RAM.

It is probably worth breaking the results into chunks where possible and writing out to disk. So, for example, diff-reduce could append each processed group of results to a CSV file rather than keeping them in memory.
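A minimal sketch of the chunked approach, using pandas' built-in `chunksize` streaming. The function name `reduce_in_chunks` and the `keep` predicate are hypothetical stand-ins for whatever reduction diff-reduce actually performs; the point is only the read-process-append pattern, which keeps at most one chunk resident at a time:

```python
import pandas as pd

def reduce_in_chunks(in_path, out_path, keep, chunksize=100_000):
    """Stream a large results CSV in fixed-size chunks, appending each
    reduced chunk to the output file instead of accumulating a second
    DataFrame in memory.

    `keep` is a hypothetical predicate: given a chunk DataFrame, it
    returns a boolean mask selecting the rows to retain.
    """
    first = True
    for chunk in pd.read_csv(in_path, chunksize=chunksize):
        subset = chunk[keep(chunk)]
        # Write the header only for the first chunk; append thereafter.
        subset.to_csv(out_path, mode="w" if first else "a",
                      header=first, index=False)
        first = False
```

Peak memory is then bounded by the chunk size rather than the full result set, at the cost of re-parsing if the reduced output is needed again later.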