I'm using the K3s distribution of Kubernetes, deployed on a Spot EC2 instance in AWS.
I have scheduled a processing job, and sometimes this job is terminated abnormally and the pod ends up in an "Unknown" state.
When I run kubectl describe pod <pod_name>, it shows this:
State: Terminated
Reason: Unknown
Exit Code: 255
Started: Wed, 06 Jan 2021 21:13:29 +0000
Finished: Wed, 06 Jan 2021 23:33:46 +0000
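For reference, the container ran for roughly 2 hours 20 minutes before it was terminated. A minimal Python sketch that computes this from the two timestamps above (the variable names are my own, chosen for illustration):

```python
from email.utils import parsedate_to_datetime

# Timestamps copied from the "kubectl describe pod" output above.
started = parsedate_to_datetime("Wed, 06 Jan 2021 21:13:29 +0000")
finished = parsedate_to_datetime("Wed, 06 Jan 2021 23:33:46 +0000")

# Elapsed wall-clock time between container start and termination.
duration = finished - started
print(duration)
```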
The AWS logs show that CPU consumption was at 99% right before the crash. From a number of sources (1, 2, 3) I saw that this can be a reason for a node crash, but I didn't see one here. What may be the reason?
Thanks!