
I have a Spring Boot application that crashes on Cloud Foundry without leaving any evident logs for the crash. The application has 3 instances, and any of the three may crash, sometimes twice a day and sometimes once in two days. There is no defined pattern to the crashes.

I have tried adding the following Java parameters, with these results:

  • -XX:ErrorFile: no file was created on error.
  • -XX:+HeapDumpOnOutOfMemoryError: a heap dump is created when the instance crashes.

So a heap dump is created when an instance crashes, but there are no OOM logs.

I also tried enabling the embedded Tomcat logs for the Spring Boot application by adding the following packages: org.apache.tomcat, org.apache.catalina, org.apache.coyote. I then triggered an OOM locally in Docker and could see the OOM log appear in the Tomcat logs for the application.

Just to clarify, the problem is: how do I find which memory component is responsible for the OOM?

vikas
  • An APM tool would probably be the best to get an insight into the memory areas of your JVM during runtime without the need to deal with the intricacies of a heap dump. Alternatively, you could try JConsole via SSH tunnel to your app container which also works very well on Cloud Foundry. – Tim Gerlach Sep 29 '19 at 20:18

2 Answers


When you're running your Java applications on Cloud Foundry, using -XX:+HeapDumpOnOutOfMemoryError isn't going to help. Adding that option would trigger a heap dump, but it's going to be written inside your app container. As soon as that finishes, the container is going to go away and you won't be able to get the file that was written.

To make this work on Cloud Foundry, the Java buildpack provides some assistance.

  1. The Java buildpack configures a killagent that gets added to the JVM. This agent will execute when there is an OutOfMemoryError. It will print a histogram of the memory usage and it will also print a memory summary. You will see these in the output of cf logs.

    Ex:

    Resource exhaustion event: the JVM was unable to allocate memory from the heap.
    ResourceExhausted! (1/0)
    | Instance Count | Total Bytes | Class Name                                    |
    | 18273          | 313157136   | [B                                            |
    | 47806          | 7648568     | [C                                            |
    | 14635          | 1287880     | Ljava/lang/reflect/Method;                    |
    | 46590          | 1118160     | Ljava/lang/String;                            |
    | 8413           | 938504      | Ljava/lang/Class;                             |
    | 28573          | 914336      | Ljava/util/concurrent/ConcurrentHashMap$Node; |
    

    and

     Memory usage:
       Heap memory: init 65011712, used 332392888, committed 351797248, max 351797248
       Non-heap memory: init 2555904, used 63098592, committed 64815104, max 377790464
    Memory pool usage:
       Code Cache: init 2555904, used 14702208, committed 15007744, max 251658240
       PS Eden Space: init 16252928, used 84934656, committed 84934656, max 84934656
       PS Survivor Space: init 2621440, used 0, committed 19398656, max 19398656
       Compressed Class Space: init 0, used 5249512, committed 5505024, max 19214336
       Metaspace: init 0, used 43150616, committed 44302336, max 106917888
       PS Old Gen: init 43515904, used 247459792, committed 247463936, max 247463936
    

    All Java apps get this when run using the Java buildpack on Cloud Foundry, and it can be helpful for understanding memory usage when your application crashes. If you don't see this then your app crashed for some other reason (see below).

  2. If you need more insight into the memory usage, you can get a full heap dump. To do this, you need to bind persistent storage to your app. If you bind a volume service to your app where the name of the service contains heap-dump, then the Java buildpack will set up this storage to automatically be used to capture heap dumps.

    If a Volume Service with the string heap-dump in its name or tag is bound to the application, terminal heap dumps will be written with the pattern <CONTAINER_DIR>/<SPACE_NAME>-<SPACE_ID[0,8]>/<APPLICATION_NAME>-<APPLICATION_ID[0,8]>/<INSTANCE_INDEX>-<TIMESTAMP>-<INSTANCE_ID[0,8]>.hprof
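If you want a similar per-pool breakdown while the app is still running, rather than only at the moment of the crash, the standard java.lang.management API exposes the same kind of numbers. The following is a minimal sketch, not the killagent's actual implementation; the class name MemorySummary and its methods are hypothetical, and the pool names you see depend on the garbage collector in use.

```java
import java.lang.management.ManagementFactory;
import java.lang.management.MemoryMXBean;
import java.lang.management.MemoryPoolMXBean;
import java.lang.management.MemoryUsage;

// Hypothetical helper: prints a summary shaped like the agent's "Memory usage" output.
final class MemorySummary {

    // Formats one MemoryUsage as "init, used, committed, max" (all in bytes).
    // A max of -1 means the JVM reports the maximum as undefined.
    static String describe(String name, MemoryUsage u) {
        return String.format("%s: init %d, used %d, committed %d, max %d",
                name, u.getInit(), u.getUsed(), u.getCommitted(), u.getMax());
    }

    // Builds the full report: heap and non-heap totals, then every memory pool.
    static String report() {
        StringBuilder sb = new StringBuilder("Memory usage:\n");
        MemoryMXBean memory = ManagementFactory.getMemoryMXBean();
        sb.append("   ").append(describe("Heap memory", memory.getHeapMemoryUsage())).append('\n');
        sb.append("   ").append(describe("Non-heap memory", memory.getNonHeapMemoryUsage())).append('\n');

        sb.append("Memory pool usage:\n");
        for (MemoryPoolMXBean pool : ManagementFactory.getMemoryPoolMXBeans()) {
            MemoryUsage usage = pool.getUsage();
            if (usage != null) { // getUsage() returns null if the pool is no longer valid
                sb.append("   ").append(describe(pool.getName(), usage)).append('\n');
            }
        }
        return sb.toString();
    }

    public static void main(String[] args) {
        System.out.print(report());
    }
}
```

Logging this report periodically (for example from a scheduled task) shows which pool is growing before the instance dies, which is the question of "which component of memory" that a heap dump alone does not answer for non-heap areas.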


If you are not seeing the output from the JVM killagent or you're not seeing heap dumps generated to your persistent storage:

  1. Check that the JVM agent has been added. When the buildpack runs during staging, you should see the killagent downloaded and installed.
  2. It should also show up in the start command that is generated by the Java buildpack. Check the start command to make sure you see the agent. Please note, it won't be possible for the buildpack to add the agent if you are specifying your own start command using cf push -c.
  3. Consider that you may simply not be experiencing an OutOfMemoryError. It is possible that your app is crashing for some other reason. The most common would be that you're exceeding the container's memory limit, not the JVM's memory limit. In this case, the container will immediately crash and you will not get output from the killagent.
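To make the difference between the JVM's limits and the container's limit concrete, here is a rough sketch that compares the maxima the JVM reports against a container limit. The class name ContainerHeadroom and the 768 MiB limit are assumptions for illustration only; use your app's real memory limit. The sum is also incomplete, because it leaves out thread stacks, direct buffers, and other native allocations, so real headroom is smaller than what this prints.

```java
import java.lang.management.ManagementFactory;
import java.lang.management.MemoryMXBean;

// Hypothetical helper: estimates how much of the container limit is left
// after the JVM's configured heap and non-heap maxima.
final class ContainerHeadroom {

    // Returns the container limit minus the JVM maxima, in bytes.
    // A negative result means the JVM is allowed to grow past the container limit,
    // so the container can be killed before any OutOfMemoryError is raised.
    static long headroomBytes(long containerLimitBytes, long heapMaxBytes, long nonHeapMaxBytes) {
        return containerLimitBytes - (heapMaxBytes + nonHeapMaxBytes);
    }

    public static void main(String[] args) {
        // Assumption: a 768 MiB container limit; replace with your app's actual limit.
        long containerLimit = 768L * 1024 * 1024;

        MemoryMXBean memory = ManagementFactory.getMemoryMXBean();
        long heapMax = memory.getHeapMemoryUsage().getMax();
        long nonHeapMax = memory.getNonHeapMemoryUsage().getMax();

        // getMax() returns -1 when the JVM reports the maximum as undefined.
        if (heapMax < 0 || nonHeapMax < 0) {
            System.out.println("A maximum is undefined; cannot estimate headroom.");
            return;
        }
        System.out.println("Estimated headroom (bytes): "
                + headroomBytes(containerLimit, heapMax, nonHeapMax));
    }
}
```

With the maxima from the sample output above (heap 351797248, non-heap 377790464) and the assumed 768 MiB limit, headroomBytes returns 75718656, which is little room for everything the summary does not count.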

Hope that helps!

Daniel Mikusa
  • 1) JVM Kill Agent is available from Cloud Foundry version 4.0. We are using version 3.x. 2) I am able to get the heap dump using the option -XX:HeapDumpPath, but the heap dump doesn't give much information. 3) Is there a way to debug the container crash? – vikas Sep 27 '19 at 21:28
  • There is also the Java Memory Assistant. This can do similar things and is in both 3.x and 4.x -> https://github.com/cloudfoundry/java-buildpack/blob/master/docs/framework-java_memory_assistant.md. You really, really shouldn't be using 3.x any more. 4.x is so much better and 3.x is getting quite old as are the dependencies it bundles. – Daniel Mikusa Oct 02 '19 at 13:12
  • Hi @DanielMikusa, regarding point 3, how do I tell whether it was the container's memory limit that was exceeded rather than the JVM's memory limit? – Tejas Shastri Jan 29 '21 at 07:45
  • Look at `cf events` for your app. If the container crashed, you will see it in the event. If it is a JVM resource exhaustion event, you won't. You'll still see a crash entry in `cf events` but it won't indicate the cause as memory, it'll just say the app exited. This is because CF doesn't know what the JVM is doing, all it sees is the process exits. – Daniel Mikusa Jan 29 '21 at 18:50

By default the heap dump is created in a file called java_pid<pid>.hprof in the working directory of the VM. You can specify an alternative file name or directory with the -XX:HeapDumpPath= option. See https://docs.oracle.com/javase/8/docs/technotes/guides/troubleshoot/clopts001.html for the Java HotSpot settings.
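To confirm which heap dump settings the running JVM actually picked up, you can read them back at runtime. This is a small sketch assuming a HotSpot-based JVM, where com.sun.management.HotSpotDiagnosticMXBean is available; the class name HeapDumpSettings is hypothetical.

```java
import com.sun.management.HotSpotDiagnosticMXBean;
import java.lang.management.ManagementFactory;

// Hypothetical helper: reads HotSpot VM options from the running JVM.
final class HeapDumpSettings {

    // Returns the current value of a HotSpot VM option as a string.
    // Throws IllegalArgumentException if the option does not exist on this JVM.
    static String option(String name) {
        HotSpotDiagnosticMXBean bean =
                ManagementFactory.getPlatformMXBean(HotSpotDiagnosticMXBean.class);
        return bean.getVMOption(name).getValue();
    }

    public static void main(String[] args) {
        // "true" only if -XX:+HeapDumpOnOutOfMemoryError is in effect.
        System.out.println("HeapDumpOnOutOfMemoryError = " + option("HeapDumpOnOutOfMemoryError"));
        // An empty value means the default location (the VM's working directory) is used.
        System.out.println("HeapDumpPath = " + option("HeapDumpPath"));
    }
}
```

Running this inside the app (or logging the two values at startup) shows whether the options set through your deployment configuration reached the JVM at all.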

MrKulli
  • The problem is not the location of heap dump. I am already getting the heap dump with the current option. The problem is that I don't have any log corresponding to memory exhaustion. – vikas Sep 27 '19 at 18:16
  • If you have GC logs, you can use any GC analyzer (e.g. https://gceasy.io/) to identify which code or objects are causing the OOM errors. – MrKulli Sep 27 '19 at 18:35