I recently joined a project and don't yet fully understand the mechanics behind it. When I run Spark commands on a YARN cluster, the application fails with exit code 13.

Here are the output logs:

Application application_1638930204378_0001 failed 2 times due to AM Container for 
appattempt_1638930204378_0001_000002 exited with exitCode: 13
Failing this attempt.Diagnostics: [2021-12-08 03:00:18.078]Exception from container-launch.
Container id: container_e147_1638930204378_0001_02_000001
Exit code: 13
Exception message: Launch container failed
Shell output: main : command provided 1
main : run as user is dv-svc-den-refinitiv
main : requested yarn user is dv-svc-den-refinitiv
Getting exit code file...
Creating script paths...
Writing pid file...
Writing to tmp file /data/05/yarn/nm/nmPrivate/application_1638930204378_0001/container_e147_1638930204378_0001_02_000001/container_e147_1638930204378_0001_02_000001.pid.tmp
Writing to cgroup task files...
Creating local dirs...
Launching container...
Getting exit code file...
Creating script paths...
[2021-12-08 03:00:18.098]Container exited with a non-zero exit code 13. Error file: prelaunch.err.
Last 4096 bytes of prelaunch.err :
Last 4096 bytes of stderr :
SLF4J: Class path contains multiple SLF4J bindings.
SLF4J: Found binding in [jar:file:/opt/cloudera/parcels/CDH-7.1.4-1.cdh7.1.4.p0.6300266/jars/slf4j-log4j12-1.7.16.jar!/org/slf4j/impl/StaticLoggerBinder.class]
SLF4J: Found binding in [jar:file:/opt/cloudera/parcels/CDH-7.1.4-1.cdh7.1.4.p0.6300266/jars/slf4j-log4j12-1.7.30.jar!/org/slf4j/impl/StaticLoggerBinder.class]
SLF4J: See http://www.slf4j.org/codes.html#multiple_bindings for an explanation.
SLF4J: Actual binding is of type [org.slf4j.impl.Log4jLoggerFactory]
21/12/08 03:00:16 INFO util.SignalUtils: Registered signal handler for TERM
21/12/08 03:00:16 INFO util.SignalUtils: Registered signal handler for HUP
21/12/08 03:00:16 INFO util.SignalUtils: Registered signal handler for INT
21/12/08 03:00:16 INFO spark.SecurityManager: Changing view acls to: dv-svc-den-refinitiv
21/12/08 03:00:16 INFO spark.SecurityManager: Changing modify acls to: dv-svc-den-refinitiv
21/12/08 03:00:16 INFO spark.SecurityManager: Changing view acls groups to: dv-its-hdp-hdfsusers
21/12/08 03:00:16 INFO spark.SecurityManager: Changing modify acls groups to:
21/12/08 03:00:16 INFO spark.SecurityManager: SecurityManager: authentication enabled; ui acls enabled; users with view permissions: Set(dv-svc-den-refinitiv); groups with view permissions: Set(dv-its-hdp-hdfsusers); users with modify permissions: Set(dv-svc-den-refinitiv); groups with modify permissions: Set()
21/12/08 03:00:16 INFO yarn.ApplicationMaster: ApplicationAttemptId: appattempt_1638930204378_0001_000002
21/12/08 03:00:16 INFO yarn.ApplicationMaster: Starting the user application in a separate Thread
21/12/08 03:00:16 INFO yarn.ApplicationMaster: Waiting for spark context initialization...
21/12/08 03:00:16 ERROR yarn.ApplicationMaster: User application exited with status 1
21/12/08 03:00:16 INFO yarn.ApplicationMaster: Final app status: FAILED, exitCode: 13, (reason: User application exited with status 1)
21/12/08 03:00:16 ERROR yarn.ApplicationMaster: Uncaught exception:
org.apache.spark.SparkException: Exception thrown in awaitResult:
at org.apache.spark.util.ThreadUtils$.awaitResult(ThreadUtils.scala:226)
at org.apache.spark.deploy.yarn.ApplicationMaster.runDriver(ApplicationMaster.scala:449)
at org.apache.spark.deploy.yarn.ApplicationMaster.run(ApplicationMaster.scala:276)
at org.apache.spark.deploy.yarn.ApplicationMaster$$anon$3.run(ApplicationMaster.scala:812)
at org.apache.spark.deploy.yarn.ApplicationMaster$$anon$3.run(ApplicationMaster.scala:811)
at java.security.AccessController.doPrivileged(Native Method)
at javax.security.auth.Subject.doAs(Subject.java:422)
at org.apache.hadoop.security.UserGroupInformation.doAs(UserGroupInformation.java:1898)
at org.apache.spark.deploy.yarn.ApplicationMaster$.main(ApplicationMaster.scala:811)
at org.apache.spark.deploy.yarn.ApplicationMaster.main(ApplicationMaster.scala)
Caused by: org.apache.spark.SparkUserAppException: User application exited with 1
at org.apache.spark.deploy.PythonRunner$.main(PythonRunner.scala:106)
at org.apache.spark.deploy.PythonRunner.main(PythonRunner.scala)
at sun.reflect.NativeMethodAccessorImpl.invoke0(Native Method)
at sun.reflect.NativeMethodAccessorImpl.invoke(NativeMethodAccessorImpl.java:62)
at sun.reflect.DelegatingMethodAccessorImpl.invoke(DelegatingMethodAccessorImpl.java:43)
at java.lang.reflect.Method.invoke(Method.java:498)
at org.apache.spark.deploy.yarn.ApplicationMaster$$anon$2.run(ApplicationMaster.scala:665)
21/12/08 03:00:17 INFO yarn.ApplicationMaster: Deleting staging directory hdfs://nameservice1/user/dv-svc-den-refinitiv/.sparkStaging/application_1638930204378_0001
21/12/08 03:00:17 WARN ipc.Client: Exception encountered while connecting to the server : org.apache.hadoop.ipc.RemoteException(org.apache.hadoop.ipc.StandbyException): Operation category READ is not supported in state standby. Visit https://s.apache.org/sbnn-error
21/12/08 03:00:17 INFO util.ShutdownHookManager: Shutdown hook called
For more detailed output, check the application tracking page: https://-------- Then click on links to logs of each attempt.
. Failing the application.

I'm sorry for throwing all these logs in here... Let me know if you need more info, or if anything sticks out!

tprebenda
  • Does this answer your question? [Spark runs on Yarn cluster exitCode=13:](https://stackoverflow.com/questions/36535411/spark-runs-on-yarn-cluster-exitcode-13) Check the comments under the question, and the accepted answer of course. – mazaneicha Dec 18 '21 at 19:02
  • You can have a look at the question answered here: [exit code 13](https://stackoverflow.com/questions/36535411/spark-runs-on-yarn-cluster-exitcode-13/36605869#36605869) – Pooja Kashyap Dec 12 '22 at 08:24
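
For anyone reading the diagnostics in the question: the lines that carry the signal are the `ERROR` entries, the `Final app status` line, and the `Caused by:` line. In this dump they say that exit code 13 wraps "User application exited with status 1" raised from `PythonRunner`, which suggests the PySpark script itself exited non-zero; the Python traceback is not part of the excerpt. A minimal sketch for pulling those lines out of a long dump is shown here. The function name `summarize_diagnostics` and the choice of patterns are illustrative assumptions, not part of Spark or YARN, and the sketch also drops repeated lines, since such dumps can contain the same stderr tail more than once.

```python
import re

# Lines that usually carry the root cause in Spark-on-YARN diagnostics:
# ERROR log entries, "Caused by:" lines, and the final application status.
# This pattern list is an assumption made for illustration.
SIGNAL = re.compile(r"(\bERROR\b|^Caused by:|Final app status:)")


def summarize_diagnostics(text):
    """Return the signal lines from a diagnostics dump, in order, without repeats."""
    seen = set()
    summary = []
    for raw in text.splitlines():
        line = raw.strip()
        if SIGNAL.search(line) and line not in seen:
            seen.add(line)
            summary.append(line)
    return summary


if __name__ == "__main__":
    # A few lines copied from the question's log, with one repeated on purpose.
    sample = "\n".join([
        "21/12/08 03:00:16 INFO yarn.ApplicationMaster: Waiting for spark context initialization...",
        "21/12/08 03:00:16 ERROR yarn.ApplicationMaster: User application exited with status 1",
        "21/12/08 03:00:16 INFO yarn.ApplicationMaster: Final app status: FAILED, exitCode: 13, (reason: User application exited with status 1)",
        "Caused by: org.apache.spark.SparkUserAppException: User application exited with 1",
        "21/12/08 03:00:16 ERROR yarn.ApplicationMaster: User application exited with status 1",
    ])
    for line in summarize_diagnostics(sample):
        print(line)
```

If log aggregation is enabled on the cluster, the full container logs (where the Python traceback would normally appear) can be fetched with `yarn logs -applicationId application_1638930204378_0001` and passed through the same filter.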
