After executing a spark-submit command on Kubernetes in cluster mode (--deploy-mode cluster), the command always exits with code 0 (success), even when the driver pod has failed. Ideally, the main pod should fail as well (i.e. go to state 'Error') if the application fails.
However, this issue does not occur in client deploy mode. In client mode, no separate driver pod is spawned and the application is executed in the main pod itself. As a result, the main pod exits with the actual exit code. So, if the application fails, the main pod fails and goes to state 'Error'.
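As a minimal illustration of the difference (a hypothetical reduction; all the --conf options and the real script from the full workflow further below are omitted):

# Hypothetical minimal comparison of the two deploy modes
# Client mode: spark-submit exits with the application's own exit code
/opt/spark/bin/spark-submit --master k8s://https://kubernetes.default.svc \
  --deploy-mode client /opt/application/ingr.py
echo $?    # 12 if the script exits with 12

# Cluster mode: spark-submit returns once the driver pod terminates, but always with 0
/opt/spark/bin/spark-submit --master k8s://https://kubernetes.default.svc \
  --deploy-mode cluster local:///opt/application/ingr.py
echo $?    # 0 even though the driver pod failed with 12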
In cluster mode, I observed the following when I executed a sample workflow to test this behaviour:
- A pod comes up with name <workflowname>-<uniqueId> (say htest-3668387602).
- Another pod is spawned with name <sparkAppName>-<uniqueId>-driver (e.g. hgigTest-7b241f8-driver). The application script is executed in this pod.
- Let's say the application script fails and exits with exit code '12'.
- The driver pod hgigTest-7b241f8-driver goes to state Error (this is expected, as the application script exits with a non-success code).
- However, the main pod htest-3668387602 finishes with state Completed (i.e. success).
- Upon checking the exit code of the main pod, it shows 0 (i.e. success), whereas it should be 12, the same as that of the driver pod (the kubectl check below confirms the driver's exit code).
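For reference, the driver pod's final exit code can be read directly with kubectl (pod name and namespace as used elsewhere in this question), which is how the value 12 can be confirmed:

# Assumes kubectl access to the namespace used by the workflow
kubectl -n racenv-dr-pps get pod hgigTest-7b241f8-driver \
  -o jsonpath='{.status.containerStatuses[0].state.terminated.exitCode}'    # prints 12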
Implications:
- Due to this issue, one cannot deduce from the main pod whether the Spark application actually succeeded or not.
- In an Argo workflow, the workflow is not stopped (i.e. further steps are still executed) even when the spark-submit command fails (in cluster mode, as described above).
Following is the relevant part of the workflow. In the args attribute, I have appended commands after spark-submit to capture and print its exit code.
# Sample K8s workflow (relevant part)
....
- name: task_template_1
  inputs:
    parameters:
      - name: abcd
  container:
    env:
      - name: EnvVar1
        value: "1234"
    image: 9919222323.dkr.ecr.us-east-1.amazonaws.com/gitlab/asdfg/image1:{{workflow.parameters.imageVersion}}
    volumeMounts:
      - mountPath: /home/app/.aws
        name: aws-creds
        readOnly: true
    command: [sh, -c]
    args: [
      " (/opt/spark/bin/spark-submit \
        --master k8s://https://kubernetes.default.svc \
        --deploy-mode cluster \
        --conf spark.driverEnv.HTTP2_DISABLE=true \
        --conf spark.executorEnv.HTTP2_DISABLE=true \
        --conf spark.driverEnv.KUBERNETES_TLS_VERSIONS='TLSv1.2,TLSv1.3' \
        --conf spark.executorEnv.KUBERNETES_TLS_VERSIONS='TLSv1.2,TLSv1.3' \
        --conf spark.kubernetes.namespace=racenv-dr-pps \
        --conf spark.kubernetes.container.image=9919222323.dkr.ecr.us-east-1.amazonaws.com/gitlab/asdfg/image1:{{workflow.parameters.imageVersion}} \
        --conf spark.jars.ivy=/tmp/.ivy \
        --conf spark.hadoop.fs.s3a.server-side-encryption.enabled=true \
        --conf spark.hadoop.fs.s3a.server-side-encryption-algorithm=SSE-KMS \
        --conf spark.hadoop.fs.s3a.server-side-encryption.key=alias/pod/racenv \
        --conf spark.hadoop.fs.s3a.aws.credentials.provider=com.amazonaws.auth.DefaultAWSCredentialsProviderChain \
        --conf spark.kubernetes.driver.podTemplateFile=/opt/application/driver.yaml \
        --conf spark.kubernetes.executor.podTemplateFile=/opt/application/executor.yaml \
        --conf spark.driver.extraJavaOptions=-Dlog4jspark.root.logger=WARN,console \
        --conf spark.app.name=hgigTest \
        --conf spark.kubernetes.authenticate.driver.serviceAccountName=dr-spark-submit \
        --conf spark.executor.memory=10g \
        --conf spark.executor.instances=4 \
        --conf spark.driver.memory=10g \
        --conf spark.executor.cores=8 \
        --conf spark.driver.cores=8 \
        --conf spark.kubernetes.executor.limit.cores=4 \
        --conf spark.kubernetes.executor.request.cores=3 \
        --conf spark.kubernetes.driver.request.cores=3 \
        --conf spark.kubernetes.driver.limit.cores=4 \
        --conf spark.kubernetes.executor.limit.memory=6g \
        --conf spark.kubernetes.executor.request.memory=4g \
        --conf spark.kubernetes.driver.limit.memory=6g \
        --conf spark.kubernetes.driver.request.memory=4g \
        --conf spark.sql.extensions=io.delta.sql.DeltaSparkSessionExtension \
        --conf spark.sql.catalog.spark_catalog=org.apache.spark.sql.delta.catalog.DeltaCatalog \
        --jars local:///opt/spark/jars/aws-java-sdk-bundle-1.11.271.jar,local:///opt/spark/jars/hadoop-aws-3.2.0.jar,local:///opt/spark/jars/delta-core_2.12-1.0.0.jar \
        local:///opt/application/ingr.py --sleepInSec {{workflow.parameters.sleepInSec}} --count {{workflow.parameters.count}}); \
      exit_code_p1=$? ; \
      echo \"exit_code_p1 is ${exit_code_p1} \"; \
      exit $exit_code_p1
      "
    ]
...
Sample end logs of the main pod htest-3668387602:
.......
22/08/03 08:30:35 INFO LoggingPodStatusWatcherImpl: Application status for spark-fad64a94a39bdb0 (phase: Failed)
22/08/03 08:30:35 INFO LoggingPodStatusWatcherImpl: Container final statuses:
container name: hing
container image: 9919222323.dkr.ecr.us-east-1.amazonaws.com/gitlab/asdfg/image1:testcode-00
container state: terminated
container started at: 2022-08-03T08:28:47Z
container finished at: 2022-08-03T08:30:28Z
exit code: 12
termination reason: Error
22/08/03 08:30:35 INFO LoggingPodStatusWatcherImpl: Application hingTest with submission ID racenv-dr-pps:hgigTest-7b241f8-driver finished
22/08/03 08:30:35 INFO ShutdownHookManager: Shutdown hook called
22/08/03 08:30:35 INFO ShutdownHookManager: Deleting directory /tmp/spark-1c1ef2b1-9871-1bec-91f8-5a8662
exit_code_p1 is 0
Note here that the exit code printed is 0 (success), whereas the log shows that the driver pod failed with exit code 12 (error).
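Since the failure is at least visible in the submitter's own log (the "phase: Failed" line from LoggingPodStatusWatcherImpl above), one workaround I can think of is to scan the spark-submit output and fail explicitly. This is only an untested sketch, and it loses the original exit code 12:

# Untested sketch: tee the spark-submit output and fail if the status watcher reported a failed phase
/opt/spark/bin/spark-submit --deploy-mode cluster ... 2>&1 | tee /tmp/submit.log
if grep -q 'phase: Failed' /tmp/submit.log ; then
  echo "driver reported phase: Failed"
  exit 1
fi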
How can I make sure that if the application fails in cluster mode, the main application pod also fails, i.e. if the driver pod fails, the main pod should fail too?
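The only other approach I can think of is to pin the driver pod name with spark.kubernetes.driver.pod.name and copy its exit code after spark-submit returns. Again an untested sketch; it assumes kubectl is available in the image, that the workflow's service account may read pods in the namespace, and that a fixed pod name does not collide with a concurrent run:

# Untested sketch: fix the driver pod name, then propagate its exit code to the main pod
DRIVER_POD=hgigtest-driver
/opt/spark/bin/spark-submit --deploy-mode cluster \
  --conf spark.kubernetes.driver.pod.name=${DRIVER_POD} \
  ...   # remaining options and the application script as in the workflow above
exit_code_p1=$(kubectl -n racenv-dr-pps get pod ${DRIVER_POD} \
  -o jsonpath='{.status.containerStatuses[0].state.terminated.exitCode}')
echo "exit_code_p1 is ${exit_code_p1}"
exit ${exit_code_p1:-1}

Both of these feel like workarounds, so a built-in way to make the main pod reflect the driver's exit status would be preferable.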