
I need to persist the heap dump when the Java process runs out of memory (OOM) and the pod is restarted.

I have added the following to the JVM args:

-XX:+ExitOnOutOfMemoryError -XX:+HeapDumpOnOutOfMemoryError -XX:HeapDumpPath=/opt/dumps

...and an emptyDir volume is mounted at the same path.
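As a quick sanity check on the setup above, the following is a minimal sketch, not a definitive implementation, that reads the two heap-dump options back from the running JVM and checks whether the dump directory looks usable. The class name `HeapDumpConfigCheck` and the helper names are hypothetical. Using the maximum heap size as a rough upper bound for the dump size is an assumption, not a guarantee.

```java
import com.sun.management.HotSpotDiagnosticMXBean;
import java.io.File;
import java.lang.management.ManagementFactory;

// Hypothetical helper class, for illustration only (HotSpot-based JVMs).
public class HeapDumpConfigCheck {

    /** Returns the current value of a HotSpot VM option, e.g. "HeapDumpPath". */
    static String vmOption(String name) {
        HotSpotDiagnosticMXBean bean =
                ManagementFactory.getPlatformMXBean(HotSpotDiagnosticMXBean.class);
        return bean.getVMOption(name).getValue();
    }

    /** True if dir exists, is writable, and has at least requiredBytes of usable space. */
    static boolean canHoldDump(File dir, long requiredBytes) {
        return dir.isDirectory() && dir.canWrite() && dir.getUsableSpace() >= requiredBytes;
    }

    public static void main(String[] args) {
        System.out.println("HeapDumpOnOutOfMemoryError = "
                + vmOption("HeapDumpOnOutOfMemoryError"));

        String path = vmOption("HeapDumpPath");
        System.out.println("HeapDumpPath = " + path);

        // ASSUMPTION: the dump is roughly bounded by the maximum heap size (-Xmx).
        long maxHeapBytes = Runtime.getRuntime().maxMemory();

        // If the option is unset, fall back to the working directory.
        File dumpDir = new File(path == null || path.isEmpty() ? "." : path);
        System.out.println("Dump directory usable for a dump of up to "
                + maxHeapBytes + " bytes: " + canHoldDump(dumpDir, maxHeapBytes));
    }
}
```

Running this inside the container with the same JVM args shows whether the flags were picked up and whether the volume mounted at /opt/dumps has room for a dump.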

The problem is that if the pod is restarted and gets scheduled on a different node, we lose the heap dump. How do I persist the heap dump even if the pod is scheduled on a different node?

We are using AWS EKS and run more than one replica of the pod.

Could anyone help with this, please?

Ivan Aracki
Baitanik
  • Hi Baitanik, does the solution proposed by Allan Chua (using EFS) solve your issue? – mozello Feb 16 '22 at 08:31
  • Consider [accepting](https://stackoverflow.com/help/accepted-answer) the answer if it solves your issue. – mozello Feb 16 '22 at 08:38
  • How about using [awsElasticBlockStore](https://kubernetes.io/docs/concepts/storage/volumes/#awselasticblockstore)? The contents of an EBS volume are persisted and the volume is unmounted when a pod is removed. – mozello Feb 17 '22 at 16:31
  • Yes, we are actually looking into this option. Unless we run into an access issue in production, I will accept this as the answer. – Baitanik Feb 18 '22 at 05:08

2 Answers


You will have to persist the heap dumps to a network location that is shared between the pods. To achieve this, you need to provide persistent volume claims; in EKS, this can be done with an Elastic File System (EFS) mounted across different availability zones. You can start learning about it by reading this guide about EFS-based PVCs.

Allan Chua
  • We had thought of EFS, but our heap dumps are generally 6 GB to 8 GB in size. Because writing to EFS is slow, the container is restarted by the liveness probe before the heap dump is fully written. We cannot raise the liveness probe settings to a high value just to capture a heap dump. Is there any other possible solution? – Baitanik Feb 17 '22 at 03:19
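The timing problem described in this comment can be made concrete with a small back-of-the-envelope calculation. The sketch below compares an estimated dump write time with the time a failing liveness probe tolerates before the container is restarted. The 100 MiB/s throughput is purely an assumption for illustration, not a measured EFS figure. The probe values are the Kubernetes defaults (`periodSeconds: 10`, `failureThreshold: 3`), which may differ from the settings in the actual deployment. The class and method names are hypothetical.

```java
// Hypothetical helper class, for illustration only.
public class DumpWriteEstimate {

    /** Seconds needed to write dumpBytes at a sustained throughput of bytesPerSecond. */
    static double writeSeconds(long dumpBytes, long bytesPerSecond) {
        return (double) dumpBytes / bytesPerSecond;
    }

    /** Approximate time a continuously failing liveness probe tolerates before a restart. */
    static long probeWindowSeconds(int periodSeconds, int failureThreshold) {
        return (long) periodSeconds * failureThreshold;
    }

    public static void main(String[] args) {
        long dumpBytes = 8L * 1024 * 1024 * 1024;   // 8 GiB, the upper size from the comment
        long bytesPerSecond = 100L * 1024 * 1024;   // ASSUMPTION: 100 MiB/s sustained writes
        int periodSeconds = 10;                     // Kubernetes default
        int failureThreshold = 3;                   // Kubernetes default

        double needed = writeSeconds(dumpBytes, bytesPerSecond);
        long window = probeWindowSeconds(periodSeconds, failureThreshold);

        System.out.printf("Estimated write time: %.2f s%n", needed);
        System.out.printf("Liveness probe window: %d s%n", window);
        System.out.println(needed > window
                ? "The dump is unlikely to finish before the container is restarted."
                : "The dump should finish within the probe window.");
    }
}
```

Under these assumed numbers the write takes longer than the probe window, which matches the behaviour described: if the application does not answer the probe while the dump is being written, the container is restarted before the file is complete.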

Since writing to EFS is too slow in your case, there is another option on AWS EKS: awsElasticBlockStore.

The contents of an EBS volume are persisted and the volume is unmounted when a pod is removed. This means that an EBS volume can be pre-populated with data, and that data can be shared between pods.

Note: You must create an EBS volume by using aws ec2 create-volume or the AWS API before you can use it.

There are some restrictions when using an awsElasticBlockStore volume:

  • the nodes on which pods are running must be AWS EC2 instances
  • those instances need to be in the same region and availability zone as the EBS volume
  • EBS only supports a single EC2 instance mounting a volume

Please check the official Kubernetes documentation page on this topic, as well as How to use persistent storage in EKS.

mozello