JVM Kill Agent not printing summary of memory spaces, Internal JVM Error instead #500
Comments
The memory summary is produced by invoking MXBeans using JNI. It seems that the problem here is that jvmkill is being driven under a (JIT) compiler thread and is then unable to make such JNI calls due to a restriction in HotSpot. Unfortunately, unless this restriction is lifted, jvmkill will not be able to produce a memory summary in these circumstances. I think the best we can do is attempt to diagnose the case where we are driven under a compiler thread and, in that case, print a suitable warning (which may imply something about the kind of memory pool being exceeded, i.e. something to do with compilation) and bypass any actions which involve JNI calls, including producing a memory summary.
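For context, the summary jvmkill assembles through JNI can be approximated in plain Java by querying the same platform MXBeans directly. This is a minimal sketch (the class and method names are mine, not jvmkill's), not jvmkill's actual implementation:

```java
import java.lang.management.ManagementFactory;
import java.lang.management.MemoryPoolMXBean;
import java.lang.management.MemoryUsage;

// Approximates the per-pool memory summary that jvmkill produces by
// invoking these same MXBeans via JNI from its agent code.
public class MemorySummary {
    public static String summarize() {
        StringBuilder sb = new StringBuilder();
        for (MemoryPoolMXBean pool : ManagementFactory.getMemoryPoolMXBeans()) {
            MemoryUsage u = pool.getUsage();
            sb.append(String.format("%s: used=%d committed=%d max=%d%n",
                    pool.getName(), u.getUsed(), u.getCommitted(), u.getMax()));
        }
        return sb.toString();
    }

    public static void main(String[] args) {
        System.out.print(summarize());
    }
}
```

The restriction discussed above means exactly these MXBean calls cannot be made when the JVMTI callback fires on a compiler thread.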
I can't find any way to determine whether jvmkill is being driven under a compiler thread, as the relevant header files are internal to HotSpot and not exposed via JVMTI or JNI. It seems that the most likely root cause here is (tiered) compilation consuming more code cache than is available. I'm surprised that the resource exhaustion exit is driven: I would hope the compiler would simply stop when it runs out of code cache. @jtuchscherer - please could you provide the JVM options being used to run the application. I'm interested in whether it takes the default settings provided by the memory calculator or whether the reserved code cache size has been set smaller than the default (240 MB on Java 8). This document on code cache tuning may also be useful.
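On a HotSpot JVM, the effective reserved code cache size can be read at runtime rather than inferred from the start command. A small sketch, assuming HotSpot (the `com.sun.management` diagnostic bean is HotSpot-specific; the class name here is my own):

```java
import java.lang.management.ManagementFactory;
import com.sun.management.HotSpotDiagnosticMXBean;

// Reads the effective -XX:ReservedCodeCacheSize on a HotSpot JVM,
// which shows whether the default (240 MB on Java 8) is in effect
// or the flag has been set smaller.
public class CodeCacheCheck {
    public static long reservedCodeCacheBytes() {
        HotSpotDiagnosticMXBean bean =
                ManagementFactory.getPlatformMXBean(HotSpotDiagnosticMXBean.class);
        return Long.parseLong(bean.getVMOption("ReservedCodeCacheSize").getValue());
    }

    public static void main(String[] args) {
        System.out.println("ReservedCodeCacheSize = "
                + reservedCodeCacheBytes() + " bytes");
    }
}
```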
Here is the start command for the app (I changed the formatting to make it easier to read - originally it is all one line):
The following JVM params have been set via the
@jtuchscherer Thanks very much. It seems the application is using the default reserved code cache size. This may be a red herring. My next step is to try to reproduce the failure. If you have a simple way of reproducing the failure, that would be extremely helpful.
Cannot reproduce the problem. Setting ReservedCodeCacheSize to a small value results in the expected behaviour rather than a crash:
Neither does reducing
@glyn Unfortunately there is no easy way to reproduce this yet. I have only seen this behavior on this on-prem PCF installation that I am currently working with.
Since there's been no response on this issue in a couple of weeks, I'm going to close it. If you'd like to see it re-opened, please comment on the issue and I'll reopen it.
Environment: We are running into a similar issue and we have no idea as to which resources are exhausted. It sure does not look like we are running out of heap space or ReservedCodeCache, but the JVM crashes with the following histogram:
We have tried doubling the ReservedCodeCache and bumping up heap spaces but still run into the exact same crash. I can reproduce it consistently with my app deployed to CF, but I am not sure what resources we are running out of.
Hi, I am facing the same issue. Does anyone have a solution for this? I tried increasing InitialCodeCacheSize, ReservedCodeCacheSize, CodeCacheExpansionSize, and CompressedClassSpaceSize, and tried reducing the thread stack size. Still getting the issue.
If @sivabalans, @navanneethan, or anyone else who has encountered this problem can provide a cut-down sample to reproduce the problem, that would be invaluable. I tried to reproduce this problem previously and was unable to.
I am an OpenJDK developer. We encountered this bug at a customer. I have a patch which just suppresses ResourceExhausted in a CompilerThread (or inside any thread unable to walk the heap), and I am trying to bring this patch upstream, but we are getting bogged down in discussions. Bug report: https://bugs.openjdk.java.net/browse/JDK-8213834 If you guys feel like chiming in on the discussion, feel free to do so.
P.S.: the short version of why this happens: the JIT compiler threads occasionally allocate memory from Metaspace. That has nothing to do with CodeCacheSize etc. If we hit an OOM in Metaspace at that point, ResourceExhausted will be posted from inside the CompilerThread. There is no real way around this. One can reduce the likelihood of that bug happening by increasing MaxMetaspaceSize.
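Whether MaxMetaspaceSize is bounded at all can be checked from inside the application via the "Metaspace" memory pool. A minimal sketch, assuming a HotSpot JVM (the pool name "Metaspace" is HotSpot-specific, and the class name is my own):

```java
import java.lang.management.ManagementFactory;
import java.lang.management.MemoryPoolMXBean;

// On HotSpot, the "Metaspace" pool reports -1 for max when
// -XX:MaxMetaspaceSize is unset (i.e. Metaspace is unbounded).
// Raising MaxMetaspaceSize reduces the likelihood of the
// compiler-thread Metaspace OOM described above.
public class MetaspaceCheck {
    public static long metaspaceMax() {
        for (MemoryPoolMXBean pool : ManagementFactory.getMemoryPoolMXBeans()) {
            if ("Metaspace".equals(pool.getName())) {
                return pool.getUsage().getMax(); // -1 means unbounded
            }
        }
        return Long.MIN_VALUE; // pool not found (non-HotSpot JVM)
    }

    public static void main(String[] args) {
        System.out.println("Metaspace max = " + metaspaceMax());
    }
}
```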
Thanks @tstuefe. I posted to the mailing list discussion. (I can't see how to register in order to post to or vote for the bug.) |
Yeah sorry, the JBS is writable only by OpenJDK authors.
Pushed to mainline OpenJDK, so it should be fixed for jdk12. I plan to downport this to jdk11u, and if possible to jdk8u. |
Great to know that this will be fixed! Thank you all for working on this. |
Backported to jdk11u: http://hg.openjdk.java.net/jdk-updates/jdk11u/rev/789a020d4afe I'll stop updating this issue now; if there is interest in bringing this fix to jdk8, pls post a short request to hotspot-dev. Cheers, Thomas |
I also faced this with the Cloud Foundry Java buildpack. My jar was around 90M, so I had to increase the -XX:MaxMetaspaceSize value to 200 and that fixed it; the other params didn't help me.
Hi, as I understand it, the fix solves the "no histogram printed" problem, but doesn't solve the "crash related to Metaspace" problem.
AFAIU (this issue is four years old) it should fix the issue described here, which is a crash due to JVMTI posting the ResourceExhausted in a JIT thread, which causes jvmkill to be invoked, which calls back into the JVM, which is not allowed. |
This information is very helpful. Thank you! @tstuefe |
Environment:
Java Buildpack 4.5.1
IaaS: vSphere
garden-runc: 1.9.0
diego: 1.23.2
cf: 259
capi: 1.28.0
Problem:
Our Application crashes in regular intervals because of resource exhaustion. In the logs, we see the following line:
2017-10-11T17:05:02.25+0200 [APP/PROC/WEB/3]ERR ResourceExhausted! (1/0)
As expected, right afterwards we see the memory histogram. But after that, there should be the Memory usage summary, correct? (At least during my testing on PWS, I see this working beautifully)
In our case, we don't see the memory summary but this error message (order not guaranteed):
After this, the application restarts and goes back to working fine.
Expected outcome
In order to diagnose the memory leak, it would be great to get the memory summary.
Notes
At this point, I assume this might be caused by the jvmkill agent trying to gather the memory data, but this is really just an unfounded suspicion.