
High memory used in data loads not immediately returned to OS #3002

Open · SoftTools59654 opened this issue Feb 10, 2024 · 2 comments
Labels: bug (Something isn't working), community

SoftTools59654 commented Feb 10, 2024

This problem is probably related to Zed itself. When a large data file is imported, a lot of RAM is consumed, and after the import finishes, Zed still occupies the same amount of RAM even though no queries, requests, or imports are running. This causes high RAM consumption in Zui.

Operating system: Windows

Once the import is finished, Zed should be able to free up the RAM it used.

SoftTools59654 added the bug label Feb 10, 2024
philrz changed the title from "RAM consumption and not decreasing zed" to "High memory used in data loads not returned to OS" Feb 12, 2024
philrz (Contributor) commented Feb 12, 2024

@SoftTools59654: Yes, the phenomenon you describe is indeed related to Zed, and more specifically to the Go programming language in which Zed is written; the same effect appears in other languages as well. There are many articles and discussions online on this topic.
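
To make the effect concrete, here is a minimal standalone Go sketch (not Zed code) that demonstrates it: even after the garbage collector has reclaimed a large allocation, the process may keep holding the pages, and runtime/debug.FreeOSMemory() can force them back to the OS.

package main

import (
    "fmt"
    "runtime"
    "runtime/debug"
)

// report prints how much heap the process is using and how much it has
// handed back to the OS so far.
func report(label string) {
    var m runtime.MemStats
    runtime.ReadMemStats(&m)
    fmt.Printf("%-10s HeapInuse=%d MB  HeapReleased=%d MB\n",
        label, m.HeapInuse>>20, m.HeapReleased>>20)
}

func main() {
    // Simulate a large "load": allocate ~1 GB, then drop the reference.
    big := make([][]byte, 1024)
    for i := range big {
        big[i] = make([]byte, 1<<20) // 1 MB each
    }
    report("loaded")

    big = nil
    runtime.GC()        // the collector reclaims the heap...
    report("after GC")  // ...but the pages may still belong to the process

    debug.FreeOSMemory() // force a GC and return memory to the OS now
    report("released")
}

On most systems, HeapReleased grows only gradually on its own after the GC runs, which mirrors the lag you're seeing with the zed serve process.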

To quantify: in our internal Autoperf system, which we use to track performance and resource usage across various operations, we've seen loads of large data sets consume 4-7 GB of system memory at peak, whereas the various query workloads consume 100-300 MB at peak, though this can certainly go higher for other query workloads (e.g., particular aggregations).

In terms of how to mitigate these effects, one short-term workaround would be to exit and relaunch Zui after a large data load, since that would cause the zed serve process that consumed excess memory during the load to exit and a new one to launch with a lower memory footprint. I understand this is not a very "user-friendly" workaround, but it's available for immediate use on a simple Zui desktop setup. Another approach would be to use a remote Zed lake, so that the zed serve process handling the large data loads could run on a remote server with more available memory.

As for long-term fixes, we have plans to do some wider research on how we might unify memory allocations in Zed (brimdata/zed#4025), and maybe at that time we'll come across an approach that allows releasing memory back to the OS. In advance of this, a short-term workaround we might consider is spawning a separate, temporary zed serve process for each individual data load operation, since that process's memory would be freed back to the OS when it exits. Overall, however, because the core dev team is tied up on other priorities, we may not be able to allocate time to these enhancements for some time.
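
To illustrate the general mechanism behind that idea (a child process gives all of its memory back to the OS when it exits), here is a rough Go sketch. The one-shot zed load invocation, the pool name, and the file name below are hypothetical placeholders, not the actual design, which would more likely spawn a temporary zed serve:

package main

import (
    "fmt"
    "log"
    "os/exec"
)

// loadInSubprocess runs a single data load in a short-lived child process.
// When the child exits, all memory it claimed goes back to the OS.
func loadInSubprocess(pool, file string) error {
    // Hypothetical invocation; substitute whatever the real load command is.
    cmd := exec.Command("zed", "load", "-use", pool, file)
    out, err := cmd.CombinedOutput()
    if err != nil {
        return fmt.Errorf("load failed: %v: %s", err, out)
    }
    log.Printf("load finished: %s", out)
    return nil
}

func main() {
    if err := loadInSubprocess("mypool", "2023-02-08-0.json"); err != nil {
        log.Fatal(err)
    }
}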

A question for @SoftTools59654 or anyone else who finds this issue: Could you help us understand the impact of this effect in your environment? If you primarily wanted an explanation for the behavior, I hope the text above is sufficient. If the effect is limiting your ability to use Zui at all, it would be helpful to understand the resource constraints and impact in your environment.

philrz (Contributor) commented Feb 13, 2024

@SoftTools59654: After speaking with our developers, I ran some more tests, and I can see that lots of memory absolutely can be returned to the OS, but it's dependent on usage.

In the attached video below, I'm using Zui v1.6.0 and loading these three JSON files that are ~1.9 GB in size after gunzip.

$ curl -O https://data.gharchive.org/2023-02-08-0.json.gz
$ curl -O https://data.gharchive.org/2023-02-08-1.json.gz
$ curl -O https://data.gharchive.org/2023-02-08-2.json.gz

The video is sped up by about 5x to reduce its size. As you can see, when the files finish loading, the zed process has claimed a peak of almost 2.5 GB of memory from the OS, and if I let it sit in that state (as I do there for a couple of minutes; observe the clock in the lower-right corner of the video), it seems it will stay that way indefinitely. However, certain activity (such as clicking Query Pool to trigger some queries against the Zed lake) may cause Go's garbage collection to become active, and this can result in significant memory being returned to the OS, as happens here, where it settles back to around 600 MB.

[Video attachment: Windows.mp4]

So you can expect that memory usage will often rise with certain heavy operations (such as the load you saw, or other memory-heavy operations like sort), but ongoing operations that require less memory (such as queries) should trigger what's shown here, where memory is freed up. The effect is not instant, and there's no guarantee it will always converge to the minimal memory usage the app reported when it first launched, though it certainly may go quite low (I observed it get as low as ~100 MB in another test). The dynamics of garbage collection are ultimately a complex topic all their own. But based on what I've observed here, I'd summarize that it's behaving in a manner that seems reasonable for a system like this, which processes potentially large amounts of data in ways that often need significant memory.
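
For anyone who wants to experiment, the Go runtime does expose a few standard knobs that govern when memory comes back; the sketch below shows their in-process forms (the GOGC and GOMEMLIMIT environment variables have the same effect on any Go program, including zed serve). This is purely illustrative, not a supported Zed configuration:

package main

import (
    "runtime/debug"
    "time"
)

func main() {
    // GOGC equivalent: run GC when the heap grows 50% past the live set
    // (the default is 100). Lower values trade CPU for a smaller footprint.
    debug.SetGCPercent(50)

    // GOMEMLIMIT equivalent (Go 1.19+): a soft cap that makes the GC work
    // harder, and release memory sooner, as the process nears 2 GiB.
    debug.SetMemoryLimit(2 << 30)

    // A blunt fallback: periodically force a GC and hand pages back to
    // the OS, e.g. after known memory-heavy phases such as a big load.
    go func() {
        for range time.Tick(5 * time.Minute) {
            debug.FreeOSMemory()
        }
    }()

    select {} // stand-in for the server's real work
}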

I'd still be curious to hear whether these behaviors impose an unreasonable burden in your deployment environment, though. I'll hold this open and see what answer you might have.

philrz changed the title from "High memory used in data loads not returned to OS" to "High memory used in data loads not immediately returned to OS" Feb 15, 2024