Skip to content
New issue

Have a question about this project? Sign up for a free GitHub account to open an issue and contact its maintainers and the community.

By clicking “Sign up for GitHub”, you agree to our terms of service and privacy statement. We’ll occasionally send you account related emails.

Already on GitHub? Sign in to your account

JVM Kill Agent not printing summary of memory spaces, Internal JVM Error instead #500

Closed
jtuchscherer opened this issue Oct 17, 2017 · 21 comments
Assignees
Labels
Milestone

Comments

@jtuchscherer
Copy link
Contributor

Environment:

Java Buildpack 4.5.1
IaaS: vSphere
garden-runc: 1.9.0
diego: 1.23.2
cf: 259
capi: 1.28.0

Problem:

Our Application crashes in regular intervals because of resource exhaustion. In the logs, we see the following line:
2017-10-11T17:05:02.25+0200 [APP/PROC/WEB/3]ERR ResourceExhausted! (1/0)

As expected, right afterwards we see the memory histogram. But after that, there should be the Memory usage summary, correct? (At least during my testing on PWS, I see this working beautifully)
In our case, we don't see the memory summary but this error message (order not guaranteed):

2017-10-11T17:05:31.80+0200 [APP/PROC/WEB/3]OUT #
2017-10-11T17:05:31.80+0200 [APP/PROC/WEB/3]OUT # A fatal error has been detected by the Java Runtime Environment:
2017-10-11T17:05:31.80+0200 [APP/PROC/WEB/3]OUT #
2017-10-11T17:05:31.80+0200 [APP/PROC/WEB/3]OUT #  Internal Error (javaCalls.cpp:53), pid=14, tid=0x00007fceab216700
2017-10-11T17:05:31.80+0200 [APP/PROC/WEB/3]OUT # JRE version: OpenJDK Runtime Environment (8.0_144-b01) (build 1.8.0_144-b01)
2017-10-11T17:05:31.80+0200 [APP/PROC/WEB/3]OUT # Java VM: OpenJDK 64-Bit Server VM (25.144-b01 mixed mode linux-amd64 compressed oops)
2017-10-11T17:05:31.80+0200 [APP/PROC/WEB/3]OUT # Failed to write core dump. Core dumps have been disabled. To enable core dumping, try "ulimit -c unlimited" before starting Java again
2017-10-11T17:05:31.80+0200 [APP/PROC/WEB/3]OUT #
2017-10-11T17:05:31.80+0200 [APP/PROC/WEB/3]OUT # An error report file with more information is saved as:
2017-10-11T17:05:31.80+0200 [APP/PROC/WEB/3]OUT # /home/vcap/app/hs_err_pid14.log
2017-10-11T17:05:31.80+0200 [APP/PROC/WEB/3]OUT #  guarantee(!thread->is_Compiler_thread()) failed: cannot make java calls from the compiler
2017-10-11T17:05:31.80+0200 [APP/PROC/WEB/3]OUT #
2017-10-11T17:05:31.80+0200 [APP/PROC/WEB/3]OUT [thread 140525006722816 also had an error]
2017-10-11T17:05:31.85+0200 [APP/PROC/WEB/3]OUT # Compiler replay data is saved as:
2017-10-11T17:05:31.85+0200 [APP/PROC/WEB/3]OUT # /home/vcap/app/replay_pid14.log
2017-10-11T17:05:31.85+0200 [APP/PROC/WEB/3]OUT #
2017-10-11T17:05:31.85+0200 [APP/PROC/WEB/3]OUT #
2017-10-11T17:05:31.85+0200 [APP/PROC/WEB/3]OUT # If you would like to submit a bug report, please visit:
2017-10-11T17:05:31.85+0200 [APP/PROC/WEB/3]OUT #   http://bugreport.java.com/bugreport/crash.jsp
2017-10-11T17:05:31.85+0200 [APP/PROC/WEB/3]OUT #
2017-10-11T17:05:31.91+0200 [APP/PROC/WEB/3]OUT Exit status 134

After this, the application restarts and goes back to working fine.

Expected outcome

In order to diagnose the memory leak, it would be great to get the memory summary.

Notes

At this point, I assume this might be caused by the jvmkill agent trying to gather the memory data, but this is really just a unfounded suspicion.

@nebhale nebhale added the bug label Oct 17, 2017
@nebhale nebhale added this to the v4.7 milestone Oct 17, 2017
@glyn
Copy link
Contributor

glyn commented Oct 19, 2017

The memory summary is produced by invoking MXBeans using JNI. It seems that the problem here is that jvmkill is being driven under a (JIT) compiler thread and is then unable to make such JNI calls due to a restriction in Hotspot. Unfortunately, unless this restriction is lifted, jvmkill will not be able to produce a memory summary in these circumstances.

I think the best we can do is attempt to diagnose the case where we are driven under a compiler thread and, in that case, print a suitable warning (which may infer something about the kind of memory pool that is being exceeded, i.e. something to do with compilation) and bypass any actions which involve JNI calls, including producing a memory summary.

@glyn
Copy link
Contributor

glyn commented Oct 23, 2017

I can't find any way to determine whether jvmkill is being driven under a compiler thread as the relevant header files are internal to Hotspot and not exposed on JVMTI or JNI.

It seems that the most likely root cause here is (tiered) compilation consuming more code cache than is available. I'm surprised that the resource exhaustion exit is driven: I would hope the compiler would simply stop when it runs out of code cache.

@jtuchscherer - please could you provide the JVM options being used to run the application. I'm interested in whether it takes the default settings provided by the memory calculator or if the reserved code cache size has been set smaller than the default (240 Mb on Java 8).

This document on code cache tuning may also be useful.

@jtuchscherer
Copy link
Contributor Author

@glyn

Here is the start command for the app (I changed the formatting to make it easier to read - originally it's is obviously all one line):

/home/vcap/app/.java-buildpack/open_jdk_jre/bin/java 
  -agentpath:/home/vcap/app/.java-buildpack/open_jdk_jre/bin/jvmkill-1.10.0_RELEASE=printHeapHistogram=1 
  -Djava.io.tmpdir=/home/vcap/tmp 
  -agentpath:/home/vcap/app/.java-buildpack/dynatrace_appmon_agent/agent/lib64/libdtagent.so=name=xxxxxxxxx,server=xxxxxxxxx 
  -Djava.rmi.server.hostname=127.0.0.1 
  -Dcom.sun.management.jmxremote.authenticate=false 
  -Dcom.sun.management.jmxremote.ssl=false 
  -Dcom.sun.management.jmxremote.port=5000 
  -Dcom.sun.management.jmxremote.rmi.port=5000 
  -Djava.ext.dirs=/home/vcap/app/.java-buildpack/container_security_provider:/home/vcap/app/.java-buildpack/open_jdk_jre/lib/ext 
  -Djava.security.properties=/home/vcap/app/.java-buildpack/security_providers/java.security 
  -Dhttps.protocols=TLSv1 
  -Djsse.enableSNIExtension=false 
  -Dlogging.config=classpath:pcf-logback.xml 
  -Dxxxxxx-loglevel=ERROR 
  -XX:+UseConcMarkSweepGC 
  -Dorg.apache.camel.jmx.statisticsLevel=Off 
  -XX:MaxRAM=2560m 
  -XX:MaxDirectMemorySize=10M 
  -XX:MaxMetaspaceSize=177414K 
  -XX:ReservedCodeCacheSize=240M 
  -XX:CompressedClassSpaceSize=27574K 
  -Xss1M 
  -Xmx2115395K 
  -cp /home/vcap/app/. 
  org.springframework.boot.loader.JarLauncher

Following JVM params have been set via the manifest.yml:

  spring.application.name: xxxxxxxxxxxx
    TZ: 'Europe/Berlin'
    JAVA_OPTS: >-
      -Dhttps.protocols=TLSv1
      -Djsse.enableSNIExtension=false
      -Dlogging.config=classpath:pcf-logback.xml
      -Dxxxxxxxxxxxxxxxxx-loglevel=ERROR
      -XX:+UseConcMarkSweepGC
      -Dorg.apache.camel.jmx.statisticsLevel=Off
      -XX:MaxRAM=2560m
    JBP_CONFIG_CONTAINER_CERTIFICATE_TRUST_STORE: '{enabled: true}'
    JBP_CONFIG_DYNA_TRACE_AGENT: '{"version": "6.3.0_+", "repository_root":"http://downloads.dynatracesaas.com/cloudfoundry/buildpack/java"}'
    JBP_CONFIG_JMX: '{enabled: true}'
    DT_LOGLEVELFILE: none
    MALLOC_ARENA_MAX: 2
    SPRING_CLOUD_CONFIG_FAIL_FAST: true
    SPRING_CLOUD_CONFIG_LABEL: xxxxxx
    SPRING_PROFILES_ACTIVE: xxxxxxxxxxxxxxxxx
    health.config.enabled: false

@glyn
Copy link
Contributor

glyn commented Oct 23, 2017

@jtuchscherer Thanks very much. It seems the application is using the default reserved code cache size.

This may be a red herring, but why is -XX:MaxRAM=2560m specified? There is an open issue #494 about the existence of JVM memory areas in addition to those specified above, which may mean that the MaxRAM setting is too small. However, I don't know whether that might cause a compiler thread to run out of memory.

My next step is to try to reproduce the failure. If you have a simple way of reproducing the failure, that would be extremely helpful.

@glyn
Copy link
Contributor

glyn commented Oct 23, 2017

Cannot reproduce the problem.

Setting ReservedCodeCacheSize to a small value results in the expected behaviour rather than a crash:

Java HotSpot(TM) 64-Bit Server VM warning: CodeCache is full. Compiler has been disabled.
Java HotSpot(TM) 64-Bit Server VM warning: Try increasing the code cache size using -XX:ReservedCodeCacheSize=
CodeCache: size=3072Kb used=2192Kb max_used=2443Kb free=879Kb
 bounds [0x000000010861c000, 0x000000010891c000, 0x000000010891c000]
 total_blobs=917 nmethods=550 adapters=279
 compilation: disabled (not enough contiguous free space left)

Neither does reducing -XX:MaxRAM reproduce the problem.

@jtuchscherer
Copy link
Contributor Author

@glyn Unfortunately there is no easy way to reproduce this yet. I have only seen this behavior on this on-prem PCF installation that I am currently working with.

@nebhale
Copy link
Member

nebhale commented Nov 14, 2017

Since there's been no response on this issue in a couple of weeks, I'm going to close it. If you'd like to see it re-opened, please comment on the issue and I'll reopen it.

@nebhale nebhale closed this as completed Nov 14, 2017
@sivabalans
Copy link

sivabalans commented Nov 29, 2017

Environment:
JBP - 3.19.1
JDK version 1.8.0_101(Have some dependency due to which we cannot move to a higher version. We are working on it)
API version - 2.75.0
CF - Opensource

We are running into a similar issue and we have no idea as to what resources are exhausted. It sure does not look like we are running out of heap space or ReservedCodeCache but the JVM crashes with the following histogram:

 2017-11-27T16:25:39.58-0800 [APP/PROC/WEB/0] OUT | 282766         | 49997496    | [C                                                                 |
   2017-11-27T16:25:39.58-0800 [APP/PROC/WEB/0] OUT | 239765         | 5754360     | Ljava/util/LinkedList$Node;                                        |
   2017-11-27T16:25:39.57-0800 [APP/PROC/WEB/0] OUT | Instance Count | Total Bytes | Class Name                                                         |
   2017-11-27T16:25:39.58-0800 [APP/PROC/WEB/0] OUT | 211124         | 8444960     | Ljdk/nashorn/internal/runtime/CompiledFunction;                    |
   2017-11-27T16:25:39.58-0800 [APP/PROC/WEB/0] OUT | 130645         | 7316120     | Ljdk/nashorn/internal/runtime/ScriptFunction;                      |
   2017-11-27T16:25:39.58-0800 [APP/PROC/WEB/0] OUT | 279476         | 6707424     | Ljava/lang/String;                                                 |
   2017-11-27T16:25:39.58-0800 [APP/PROC/WEB/0] OUT | 121552         | 3889664     | Ljava/util/HashMap$Node;                                           |
   2017-11-27T16:25:39.58-0800 [APP/PROC/WEB/0] OUT | 159817         | 5114144     | Ljava/util/LinkedList;                                             |
   2017-11-27T16:25:39.59-0800 [APP/PROC/WEB/0] OUT | 8865           | 3761712     | [B                                                                 |
   2017-11-27T16:25:39.59-0800 [APP/PROC/WEB/0] OUT | 117062         | 3745984     | Ljava/util/concurrent/ConcurrentHashMap$Node;                      |
   2017-11-27T16:25:39.59-0800 [APP/PROC/WEB/0] OUT | 49927          | 3195328     | Lcom/newrelic/agent/deps/org/objectweb/asm/Label;                  |
   2017-11-27T16:25:39.59-0800 [APP/PROC/WEB/0] OUT | 54910          | 3074960     | Ljava/util/LinkedHashMap;                                          |
   2017-11-27T16:25:39.58-0800 [APP/PROC/WEB/0] OUT | 135632         | 4340224     | Ljdk/nashorn/internal/runtime/FinalScriptFunctionData;             |
   2017-11-27T16:25:39.58-0800 [APP/PROC/WEB/0] OUT | 63494          | 4256264     | [Ljava/lang/Object;                                                |
   2017-11-27T16:25:39.59-0800 [APP/PROC/WEB/0] OUT | 38506          | 3388528     | Ljava/lang/reflect/Method;                                         |
   2017-11-27T16:25:39.59-0800 [APP/PROC/WEB/0] OUT | 19817          | 2210080     | Ljava/lang/Class;                                                  |
   2017-11-27T16:25:39.60-0800 [APP/PROC/WEB/0] OUT | 52250          | 2090000     | Ljava/util/LinkedHashMap$Entry;                                    |
   2017-11-27T16:25:39.60-0800 [APP/PROC/WEB/0] OUT | 28427          | 1819328     | Ljdk/nashorn/internal/runtime/PropertyMap;                         |
   2017-11-27T16:25:39.60-0800 [APP/PROC/WEB/0] OUT | 49385          | 1975400     | Lcom/newrelic/agent/deps/org/objectweb/asm/tree/VarInsnNode;       |
   2017-11-27T16:25:39.59-0800 [APP/PROC/WEB/0] OUT | 19534          | 2196032     | [Ljava/util/HashMap$Node;                                          |
   2017-11-27T16:25:39.60-0800 [APP/PROC/WEB/0] OUT | 45515          | 2184720     | Lcom/ctc/wstx/io/WstxInputLocation;                                |
   2017-11-27T16:25:39.60-0800 [APP/PROC/WEB/0] OUT | 54579          | 2183160     | Lcom/newrelic/agent/deps/org/objectweb/asm/tree/LabelNode;         |
   2017-11-27T16:25:39.59-0800 [APP/PROC/WEB/0] OUT | 81460          | 2606720     | Ljdk/nashorn/internal/runtime/PropertyHashMap$Element;             |
   2017-11-27T16:25:39.59-0800 [APP/PROC/WEB/0] OUT | 24697          | 2296632     | [I                                                                 |
   2017-11-27T16:25:39.60-0800 [APP/PROC/WEB/0] OUT | 64860          | 1556640     | Ljdk/nashorn/internal/runtime/Specialization;                      |
   2017-11-27T16:25:39.60-0800 [APP/PROC/WEB/0] OUT | 39486          | 1579440     | Ljava/lang/invoke/BoundMethodHandle$Species_LI;                    |
   2017-11-27T16:25:39.60-0800 [APP/PROC/WEB/0] OUT | 36218          | 1448720     | Lcom/newrelic/agent/deps/org/objectweb/asm/tree/InsnNode;          |
   2017-11-27T16:25:39.61-0800 [APP/PROC/WEB/0] OUT | 31655          | 1266200     | Ljava/lang/invoke/BoundMethodHandle$Species_L3;                    |
   2017-11-27T16:25:39.62-0800 [APP/PROC/WEB/0] OUT | 20380          | 815200      | Ljava/lang/invoke/BoundMethodHandle$Species_LL;                    |
   2017-11-27T16:25:39.62-0800 [APP/PROC/WEB/0] OUT | 22915          | 733280      | Lorg/aspectj/weaver/patterns/ExposedState;                         |
   2017-11-27T16:25:39.61-0800 [APP/PROC/WEB/0] OUT | 48720          | 1169280     | Ljava/util/ArrayList;                                              |
   2017-11-27T16:25:39.61-0800 [APP/PROC/WEB/0] OUT | 2150           | 1094880     | [Ljava/util/concurrent/ConcurrentHashMap$Node;                     |
   2017-11-27T16:25:39.60-0800 [APP/PROC/WEB/0] OUT | 26037          | 1458072     | Lcom/newrelic/agent/deps/org/objectweb/asm/tree/MethodInsnNode;    |
   2017-11-27T16:25:39.61-0800 [APP/PROC/WEB/0] OUT | 19681          | 1259584     | Lcom/ge/workflow/ContextAwareTask;                                 |
   2017-11-27T16:25:39.61-0800 [APP/PROC/WEB/0] OUT | 23204          | 1113792     | Lcom/newrelic/agent/deps/org/objectweb/asm/tree/LineNumberNode;    |
   2017-11-27T16:25:39.61-0800 [APP/PROC/WEB/0] OUT | 22915          | 1099920     | Lorg/aspectj/weaver/reflect/ShadowMatchImpl;                       |
   2017-11-27T16:25:39.61-0800 [APP/PROC/WEB/0] OUT | 45461          | 1091064     | Lcom/ge/workflow/ComponentLocation;                                |
   2017-11-27T16:25:39.61-0800 [APP/PROC/WEB/0] OUT | 5047           | 1065712     | [Ljdk/nashorn/internal/runtime/PropertyHashMap$Element;            |
   2017-11-27T16:25:39.62-0800 [APP/PROC/WEB/0] OUT | 19623          | 863544      | [Ljdk/nashorn/internal/runtime/Property;                           |
   2017-11-27T16:25:39.62-0800 [APP/PROC/WEB/0] OUT | 16812          | 672480      | Lcom/ge/workflow/WorkflowStatus$TaskStatus;                        |
   2017-11-27T16:25:39.62-0800 [APP/PROC/WEB/0] OUT | 41915          | 670640      | Ljava/lang/Object;                                                 |
   2017-11-27T16:25:39.62-0800 [APP/PROC/WEB/0] OUT | 45535          | 728560      | Ljava/util/LinkedHashSet;                                          |
   2017-11-27T16:25:39.61-0800 [APP/PROC/WEB/0] OUT | 18836          | 1054816     | Ljdk/nashorn/internal/runtime/AccessorProperty;                    |
   2017-11-27T16:25:39.63-0800 [APP/PROC/WEB/0] OUT | 19037          | 609184      | Ljavax/xml/bind/JAXBElement;                                       |
   2017-11-27T16:25:39.63-0800 [APP/PROC/WEB/0] OUT | 940            | 443680      | Ljdk/nashorn/internal/objects/Global;                              |
   2017-11-27T16:25:39.62-0800 [APP/PROC/WEB/0] OUT | 20902          | 668864      | Ljdk/nashorn/internal/runtime/PropertyHashMap;                     |
   2017-11-27T16:25:39.62-0800 [APP/PROC/WEB/0] OUT | 12757          | 612336      | Ljava/util/HashMap;                                                |
   2017-11-27T16:25:39.63-0800 [APP/PROC/WEB/0] OUT | 15089          | 603560      | Lcom/newrelic/agent/deps/org/objectweb/asm/tree/JumpInsnNode;      |
   2017-11-27T16:25:39.63-0800 [APP/PROC/WEB/0] OUT | 25886          | 568480      | [Ljava/lang/Class;                                                 |
   2017-11-27T16:25:39.63-0800 [APP/PROC/WEB/0] OUT | 25190          | 531392      | [Ljava/lang/String;                                                |
   2017-11-27T16:25:39.63-0800 [APP/PROC/WEB/0] OUT | 15817          | 508128      | [Ljava/lang/invoke/MethodHandle;                                   |
   2017-11-27T16:25:39.64-0800 [APP/PROC/WEB/0] OUT | 8324           | 332960      | Ljdk/nashorn/internal/scripts/JO;                                  |
   2017-11-27T16:25:39.63-0800 [APP/PROC/WEB/0] OUT | 21708          | 520992      | Lcom/ge/workflow/ComponentMetadata;                                |
   2017-11-27T16:25:39.63-0800 [APP/PROC/WEB/0] OUT | 10559          | 506832      | Lcom/newrelic/agent/deps/org/objectweb/asm/tree/FieldInsnNode;     |
   2017-11-27T16:25:39.64-0800 [APP/PROC/WEB/0] OUT | 24685          | 394960      | Ljava/util/concurrent/atomic/AtomicReference;                      |
   2017-11-27T16:25:39.64-0800 [APP/PROC/WEB/0] OUT | 5787           | 324072      | Lcom/ge/workflow/types/MixedScriptType;                            |
   2017-11-27T16:25:39.64-0800 [APP/PROC/WEB/0] OUT | 4215           | 316376      | [J                                                                 |
   2017-11-27T16:25:39.64-0800 [APP/PROC/WEB/0] OUT | 19387          | 310192      | Lcom/ge/workflow/ComponentFactoryProvider$2;                       |
   2017-11-27T16:25:39.63-0800 [APP/PROC/WEB/0] OUT | 3717           | 416304      | Lcom/newrelic/weave/utils/SynchronizedMethodNode;                  |
   2017-11-27T16:25:39.64-0800 [APP/PROC/WEB/0] OUT | 19681          | 314896      | Lcom/ge/workflow/ComponentFactoryProvider$$Lambda$63/1364255968;   |
   2017-11-27T16:25:39.64-0800 [APP/PROC/WEB/0] OUT | 1386           | 296184      | [Z                                                                 |
   2017-11-27T16:25:39.65-0800 [APP/PROC/WEB/0] OUT | 4742           | 227616      | Ljava/lang/invoke/BoundMethodHandle$Species_L3I;                   |
   2017-11-27T16:25:39.64-0800 [APP/PROC/WEB/0] OUT | 6125           | 392000      | Ljava/util/concurrent/ConcurrentHashMap;                           |
   2017-11-27T16:25:39.65-0800 [APP/PROC/WEB/0] OUT | 11393          | 273432      | Lcom/ge/workflow/EvaluableExpression;                              |
   2017-11-27T16:25:39.65-0800 [APP/PROC/WEB/0] OUT | 11161          | 267864      | Ljavax/xml/namespace/QName;                                        |
   2017-11-27T16:25:39.65-0800 [APP/PROC/WEB/0] OUT | 4706           | 263536      | Ljdk/nashorn/internal/runtime/UserAccessorProperty;                |
   2017-11-27T16:25:39.65-0800 [APP/PROC/WEB/0] OUT | 6060           | 242400      | Lcom/newrelic/agent/deps/org/objectweb/asm/tree/LdcInsnNode;       |
   2017-11-27T16:25:39.65-0800 [APP/PROC/WEB/0] OUT | 5980           | 239200      | Lcom/newrelic/agent/deps/org/objectweb/asm/tree/TypeInsnNode;      |
   2017-11-27T16:25:39.65-0800 [APP/PROC/WEB/0] OUT | 9809           | 235416      | Lcom/ge/workflow/EvaluableTextExpression;                          |
   2017-11-27T16:25:39.65-0800 [APP/PROC/WEB/0] OUT | 9427           | 226248      | Lcom/newrelic/agent/deps/org/objectweb/asm/Edge;                   |
   2017-11-27T16:25:39.64-0800 [APP/PROC/WEB/0] OUT | 6077           | 291696      | Lcom/newrelic/agent/deps/org/objectweb/asm/Frame;                  |
   2017-11-27T16:25:39.65-0800 [APP/PROC/WEB/0] OUT | 6580           | 263200      | Ljdk/nashorn/internal/runtime/NativeJavaPackage;                   |
   2017-11-27T16:25:39.66-0800 [APP/PROC/WEB/0] OUT | 2235           | 214560      | Lorg/springframework/beans/GenericTypeAwarePropertyDescriptor;     |
   2017-11-27T16:25:39.66-0800 [APP/PROC/WEB/0] OUT | 4968           | 198720      | Ljdk/nashorn/internal/objects/NativeArray;                         |
   2017-11-27T16:25:39.66-0800 [APP/PROC/WEB/0] OUT | 5291           | 211640      | Ljava/util/WeakHashMap$Entry;                                      |
   2017-11-27T16:25:39.66-0800 [APP/PROC/WEB/0] OUT | 5090           | 203600      | Lcom/newrelic/agent/deps/org/objectweb/asm/tree/LocalVariableNode; |
   2017-11-27T16:25:39.66-0800 [APP/PROC/WEB/0] OUT | 2235           | 196680      | Lcom/ge/workflow/types/HttpRequestType;                            |
   2017-11-27T16:25:39.66-0800 [APP/PROC/WEB/0] OUT | 2380           | 190400      | Ljava/net/URI;                                                     |
   2017-11-27T16:25:39.66-0800 [APP/PROC/WEB/0] OUT | 4319           | 172760      | Lcom/fasterxml/jackson/databind/introspect/AnnotatedMethod;        |
   2017-11-27T16:25:39.66-0800 [APP/PROC/WEB/0] OUT | 9244           | 221856      | Lorg/springframework/core/MethodClassKey;                          |
   2017-11-27T16:25:39.66-0800 [APP/PROC/WEB/0] OUT | 6922           | 221504      | Ljava/lang/ref/WeakReference;                                      |
   2017-11-27T16:25:39.67-0800 [APP/PROC/WEB/0] OUT | 2235           | 143040      | Lcom/ge/workflow/HttpRequestTask$Builder;                          |
   2017-11-27T16:25:39.67-0800 [APP/PROC/WEB/0] OUT | 3580           | 143200      | Lcom/newrelic/agent/deps/org/objectweb/asm/tree/IntInsnNode;       |
   2017-11-27T16:25:39.67-0800 [APP/PROC/WEB/0] OUT | 6580           | 157920      | Ljdk/nashorn/internal/runtime/UserAccessorProperty$Accessors;      |
   2017-11-27T16:25:39.67-0800 [APP/PROC/WEB/0] OUT | 2646           | 148176      | Ljava/lang/invoke/MemberName;                                      |
   2017-11-27T16:25:39.67-0800 [APP/PROC/WEB/0] OUT | 2624           | 125952      | Ljava/lang/invoke/MethodHandleImpl$CountingWrapper;                |
   2017-11-27T16:25:39.67-0800 [APP/PROC/WEB/0] OUT | 1724           | 124128      | Ljava/lang/reflect/Field;                                          |
   2017-11-27T16:25:39.68-0800 [APP/PROC/WEB/0] OUT | 5062           | 121488      | Ljava/beans/MethodRef;                                             |
   2017-11-27T16:25:39.67-0800 [APP/PROC/WEB/0] OUT | 4131           | 165240      | Ljava/util/TreeMap$Entry;                                          |
   2017-11-27T16:25:39.67-0800 [APP/PROC/WEB/0] OUT | 4775           | 152800      | Ljdk/nashorn/internal/runtime/linker/LinkerCallSite;               |
   2017-11-27T16:25:39.68-0800 [APP/PROC/WEB/0] OUT | 1672           | 120384      | Lorg/springframework/core/annotation/AnnotationAttributes;         |
   2017-11-27T16:25:39.68-0800 [APP/PROC/WEB/0] OUT | 940            | 120320      | Ljdk/nashorn/internal/objects/NativeString$Prototype;              |
   2017-11-27T16:25:39.68-0800 [APP/PROC/WEB/0] OUT | 4968           | 119232      | Lcom/ge/workflow/extension/sql/task/Param;                         |
   2017-11-27T16:25:39.68-0800 [APP/PROC/WEB/0] OUT | 3717           | 118944      | Lcom/newrelic/agent/deps/org/objectweb/asm/tree/InsnList;          |
   2017-11-27T16:25:39.68-0800 [APP/PROC/WEB/0] OUT | 2070           | 115920      | Lcom/ge/workflow/extension/sql/type/SqlType;                       |
   2017-11-27T16:25:39.68-0800 [APP/PROC/WEB/0] OUT | 940            | 120320      | Ljdk/nashorn/internal/objects/NativeArray$Prototype;               |
   2017-11-27T16:25:39.68-0800 [APP/PROC/WEB/0] OUT | 840            | 114240      | Lorg/apache/openjpa/jdbc/meta/ValueMappingImpl;                    |
   2017-11-27T16:25:39.69-0800 [APP/PROC/WEB/0] OUT | 940            | 112800      | Ljdk/nashorn/internal/objects/NativeObject$Constructor;            |
   2017-11-27T16:25:39.68-0800 [APP/PROC/WEB/0] OUT | 2063           | 115528      | Lcom/ge/workflow/types/RunType;                                    |
   2017-11-27T16:25:39.69-0800 [APP/PROC/WEB/0] OUT #
   2017-11-27T16:25:39.69-0800 [APP/PROC/WEB/0] OUT | 1729           | 110656      | Lorg/springframework/core/MethodParameter;                         |
   2017-11-27T16:25:39.69-0800 [APP/PROC/WEB/0] OUT | 3451           | 110432      | Ljdk/nashorn/internal/runtime/arrays/ObjectArrayData;              |
   2017-11-27T16:25:39.69-0800 [APP/PROC/WEB/0] OUT #  Internal Error (javaCalls.cpp:53), pid=9, tid=0x00007f01b84e7700
   2017-11-27T16:25:39.69-0800 [APP/PROC/WEB/0] OUT #  guarantee(!thread->is_Compiler_thread()) failed: cannot make java calls from the compiler
   2017-11-27T16:25:39.69-0800 [APP/PROC/WEB/0] OUT #
   2017-11-27T16:25:39.69-0800 [APP/PROC/WEB/0] OUT # A fatal error has been detected by the Java Runtime Environment:
   2017-11-27T16:25:39.69-0800 [APP/PROC/WEB/0] OUT # JRE version: OpenJDK Runtime Environment (8.0_101-b13) (build 1.8.0_101-b13)
   2017-11-27T16:25:39.69-0800 [APP/PROC/WEB/0] OUT # Failed to write core dump. Core dumps have been disabled. To enable core dumping, try "ulimit -c unlimited" before starting Java again
   2017-11-27T16:25:39.69-0800 [APP/PROC/WEB/0] OUT #
   2017-11-27T16:25:39.69-0800 [APP/PROC/WEB/0] OUT # /home/vcap/app/hs_err_pid9.log
   2017-11-27T16:25:39.69-0800 [APP/PROC/WEB/0] OUT #
   2017-11-27T16:25:39.69-0800 [APP/PROC/WEB/0] OUT # An error report file with more information is saved as:
   2017-11-27T16:25:39.69-0800 [APP/PROC/WEB/0] OUT # Java VM: OpenJDK 64-Bit Server VM (25.101-b13 mixed mode linux-amd64 compressed oops)
   2017-11-27T16:25:39.72-0800 [APP/PROC/WEB/0] OUT #
   2017-11-27T16:25:39.72-0800 [APP/PROC/WEB/0] OUT # Compiler replay data is saved as:
   2017-11-27T16:25:39.72-0800 [APP/PROC/WEB/0] OUT # If you would like to submit a bug report, please visit:
   2017-11-27T16:25:39.72-0800 [APP/PROC/WEB/0] OUT # /home/vcap/app/replay_pid9.log
   2017-11-27T16:25:39.72-0800 [APP/PROC/WEB/0] OUT #
   2017-11-27T16:25:39.72-0800 [APP/PROC/WEB/0] OUT #
   2017-11-27T16:25:39.72-0800 [APP/PROC/WEB/0] OUT #   http://bugreport.java.com/bugreport/crash.jsp

We have tried doubling the ReservedCodeCache and bumping up heap spaces but still run into the exact same crash. I can reproduce it consistently with my app deployed to CF but I am not sure what resources we are running out of.

@navanneethan
Copy link

navanneethan commented Jan 22, 2018

Hi,

I am facing the same issue. Did anyone have the solution for this issue. I tried increasing the InitialCodeCacheSize, ReservedCodeCacheSize, CodeCacheExpansionSize, CompressedClassSpaceSize. And tried to reduce the thread Stack size. Still getting the issue.

@glyn
Copy link
Contributor

glyn commented Jan 22, 2018

If @sivabalans, @navanneethan, or anyone else who has encountered this problem can provide a cut-down sample to reproduce the problem, that would be invaluable. I tried to reproduce this problem previously and was unable to.

@tstuefe
Copy link

tstuefe commented Nov 21, 2018

I am an OpenJDK developer. We encountered this bug at a customer. I have a patch which would just suppress ResourceExhausted in a CompilerThread (or inside any thread unable to walk the heap) and try to bring this patch upstream, but we are getting bogged down in discussions.

Bug report: https://bugs.openjdk.java.net/browse/JDK-8213834
Mailing list discussion - still no consensus how to handle this: http://mail.openjdk.java.net/pipermail/hotspot-runtime-dev/2018-November/031107.html

If you guys feel like chiming in on the discussion, feel free to do so.

@tstuefe
Copy link

tstuefe commented Nov 21, 2018

P.s. the short version why this happens: the JIT compiler threads occasionally allocate memory from MetaSpace. That has nothing to do with CodeCacheSize etc. If we hit a OOM in Metaspace at that point, ResourceExhausted will be posted from inside the CompilerThread.

There is no real way around this. One can reduce the likelihood of that Bug happening by increasing MaxMetaspaceSize.

@glyn
Copy link
Contributor

glyn commented Nov 21, 2018

Thanks @tstuefe. I posted to the mailing list discussion. (I can't see how to register in order to post to or vote for the bug.)

@tstuefe
Copy link

tstuefe commented Nov 21, 2018

Yeah sorry, the JBS is writable only to OpenJDK authors only.

@tstuefe
Copy link

tstuefe commented Nov 29, 2018

Pushed to mainline OpenJDK, so it should be fixed for jdk12. I plan to downport this to jdk11u, and if possible to jdk8u.

https://bugs.openjdk.java.net/browse/JDK-8213834

@jtuchscherer
Copy link
Contributor Author

Great to know that this will be fixed! Thank you all for working on this.

@tstuefe
Copy link

tstuefe commented Dec 5, 2018

Backported to jdk11u: http://hg.openjdk.java.net/jdk-updates/jdk11u/rev/789a020d4afe

I'll stop updating this issue now; if there is interest in bringing this fix to jdk8, pls post a short request to hotspot-dev.

Cheers, Thomas

@bhishekarora
Copy link

Hi,

I am facing the same issue. Did anyone have the solution for this issue. I tried increasing the InitialCodeCacheSize, ReservedCodeCacheSize, CodeCacheExpansionSize, CompressedClassSpaceSize. And tried to reduce the thread Stack size. Still getting the issue.

I also faced this with cloud foundry java build pack , my jar was around 90m so I had to increase the XX:MaxMetaspaceSize value to 200 and it fixed it, other params didnt help me.

@berkkonak
Copy link

P.s. the short version why this happens: the JIT compiler threads occasionally allocate memory from MetaSpace. That has nothing to do with CodeCacheSize etc. If we hit a OOM in Metaspace at that point, ResourceExhausted will be posted from inside the CompilerThread.

There is no real way around this. One can reduce the likelihood of that Bug happening by increasing MaxMetaspaceSize.

Hi, as I understood, the fix solves "no histogram printed" problem, but doesn't solve "crash related to MetaSpace" problem.
Just to be clear, did I understand right? Thanks in advance.

@tstuefe
Copy link

tstuefe commented Jul 27, 2023

P.s. the short version why this happens: the JIT compiler threads occasionally allocate memory from MetaSpace. That has nothing to do with CodeCacheSize etc. If we hit a OOM in Metaspace at that point, ResourceExhausted will be posted from inside the CompilerThread.
There is no real way around this. One can reduce the likelihood of that Bug happening by increasing MaxMetaspaceSize.

Hi, as I understood, the fix solves "no histogram printed" problem, but doesn't solve "crash related to MetaSpace" problem. Just to be clear, did I understand right? Thanks in advance.

AFAIU (this issue is four years old) it should fix the issue described here, which is a crash due to JVMTI posting the ResourceExhausted in a JIT thread, which causes jvmkill to be invoked, which calls back into the JVM, which is not allowed.

@berkkonak
Copy link

P.s. the short version why this happens: the JIT compiler threads occasionally allocate memory from MetaSpace. That has nothing to do with CodeCacheSize etc. If we hit a OOM in Metaspace at that point, ResourceExhausted will be posted from inside the CompilerThread.
There is no real way around this. One can reduce the likelihood of that Bug happening by increasing MaxMetaspaceSize.

Hi, as I understood, the fix solves "no histogram printed" problem, but doesn't solve "crash related to MetaSpace" problem. Just to be clear, did I understand right? Thanks in advance.

AFAIU (this issue is four years old) it should fix the issue described here, which is a crash due to JVMTI posting the ResourceExhausted in a JIT thread, which causes jvmkill to be invoked, which calls back into the JVM, which is not allowed.

This information is very helpful. Thank you! @tstuefe

Sign up for free to join this conversation on GitHub. Already have an account? Sign in to comment
Labels
Projects
None yet
Development

No branches or pull requests

8 participants