Backend memory leak #514
This is normal behaviour for Node: it has a virtual memory space and tries to use the maximum of it. The machine it's hosted on must have more RAM than the maximum GC virtual memory. We can either reduce Node's space (e.g. with `--max-old-space-size`) or give the machine more RAM.
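As a minimal sketch (the file name is hypothetical), the ceiling in question can be read at runtime; it is exactly what `--max-old-space-size` lowers:

```js
// A minimal sketch: print the heap ceiling V8 was started with.
// Run as e.g. `node --max-old-space-size=512 heap-limit.js` to cap it.
const v8 = require('v8');

const stats = v8.getHeapStatistics();
const mb = (n) => (n / 1024 ** 2).toFixed(0);
console.log(`heap_size_limit: ${mb(stats.heap_size_limit)} MB`);
console.log(`used_heap_size:  ${mb(stats.used_heap_size)} MB`);
```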
I come from the Java world, where our mail servers operate on the CNB architecture with a 2 GB heap without issue (the `-Xmx` setting defining the memory used by the JVM is, of course, set in accordance with the Kubernetes limits). So yes, Node.js does not run alone there, and we need to set both limits in a reasonable way. I do think a couple of GBs should be enough for a not-that-stressed web app. I encourage you to update the tdrive Helm charts accordingly.

Second, again coming from the Java world, an over-sized old generation is suspicious at best, especially when implementing a REST API, which is supposed to follow a per-request disconnected model. Is tdrive doing caching? Improperly configured caching can lead to exactly this behaviour...

Finally, I am not a Node expert, but other Node applications we run do not exhibit such behaviour; to mention the OpenPaaS nodes on the same deployment, they stay well below 500 MB. All of this points toward tdrive's memory needing better management.
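To illustrate the caching concern with a purely hypothetical sketch (none of these names come from tdrive's code): a long-lived unbounded `Map` used as a cache is exactly the kind of structure that accumulates in the old generation, and capping it is the usual fix.

```js
// Hypothetical illustration, not taken from tdrive: an unbounded Map used as
// a cache grows forever; evicting the oldest entry keeps it at a fixed size.
class BoundedCache {
  constructor(maxEntries) {
    this.maxEntries = maxEntries;
    this.map = new Map(); // insertion-ordered, so the first key is the oldest
  }

  get(key) {
    return this.map.get(key);
  }

  set(key, value) {
    if (!this.map.has(key) && this.map.size >= this.maxEntries) {
      // Evict the oldest entry instead of growing without bound.
      this.map.delete(this.map.keys().next().value);
    }
    this.map.set(key, value);
  }
}
```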
Please proceed to a memory dump and analyse the memory content. (In Java I would happily do a heap dump and analyse it with VisualVM...)
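Node has a close equivalent to a VisualVM heap dump. A minimal sketch, assuming Node >= 11.13: write a V8 heap snapshot on demand and inspect the resulting `.heapsnapshot` file in the Memory tab of Chrome DevTools.

```js
// Minimal sketch: dump a heap snapshot when the process receives SIGUSR2.
// writeHeapSnapshot() is synchronous and returns the generated file name.
const v8 = require('v8');

process.on('SIGUSR2', () => {
  const file = v8.writeHeapSnapshot();
  console.log(`heap snapshot written to ${file}`);
});
```

Recent Node versions can also do this without any code via the `--heapsnapshot-signal=SIGUSR2` flag. Comparing two snapshots taken an hour apart should show which objects are accumulating.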
Hi, |
(And if there is a leak, we should look at it once Node is the one catching it, because the kernel OOM kill will probably be unrelated to our code, since Node expects that space to be available.)
I think the word "you" is misused here. I am not part of the tdrive team, nor an ops person, just a fellow dev working on another product, reporting issues he has seen. Agreed on pushing the memory limit down to the tdrive application. I still believe there is a deeper issue lurking in there...
Perhaps there is, but this is the expected behaviour even if there weren't a deeper issue. If you want to make such an issue more visible, then instead of increasing the VM's memory, we should decrease Node's space to stress it earlier, until it's truly deployed. Provided we remember to change it back before then, this is probably a better approach from a diagnostic perspective.
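A minimal sketch of that "stress it earlier" approach: start the backend with a lowered cap (e.g. `--max-old-space-size=512`) and log memory periodically, so the growth curve is visible in the container logs well before any OOM kill.

```js
// Minimal sketch: log process memory every minute so growth shows up in the
// logs; pair it with a lowered --max-old-space-size to fail fast.
setInterval(() => {
  const { rss, heapUsed, heapTotal, external } = process.memoryUsage();
  const mb = (n) => (n / 1024 ** 2).toFixed(0);
  console.log(`rss=${mb(rss)}MB heapUsed=${mb(heapUsed)}MB ` +
              `heapTotal=${mb(heapTotal)}MB external=${mb(external)}MB`);
}, 60_000).unref(); // unref so the timer doesn't keep the process alive
```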
Describe the bug
Over time, tdrive consumes more and more memory.
At 10 am:
At 12:
This eventually leads to the 2 GB quota being exceeded and the pod being salvaged:
To Reproduce
Run and monitor Twake Drive in production.
Expected behavior
No memory leak. I expect tdrive memory consumption to reach a stable point.