I ran the following command:

$ spark-submit --master yarn --deploy-mode cluster pi.py

The following log line was then printed continuously:

...
2021-12-23 06:07:50,158 INFO yarn.Client: Application report for application_1640239254568_0002 (state: ACCEPTED)
2021-12-23 06:07:51,162 INFO yarn.Client: Application report for application_1640239254568_0002 (state: ACCEPTED)
...

I checked the result through the web UI on port 8088 (the "Logs for container" page), but there was nothing in stdout.

I was disappointed and was about to force the Spark job to end, but then new log lines suddenly appeared:

...
2021-12-23 06:09:06,694 INFO yarn.Client: Application report for application_1640239254568_0002 (state: RUNNING)
2021-12-23 06:09:06,695 INFO yarn.Client:
         client token: N/A
         diagnostics: N/A
         ApplicationMaster host: master
         ApplicationMaster RPC port: 40451
         queue: default
         start time: 1640239668020
         final status: UNDEFINED
         tracking URL: http://master2:8088/proxy/application_1640239254568_0002/
         user: root
2021-12-23 06:09:07,707 INFO yarn.Client: Application report for application_1640239254568_0002 (state: RUNNING)
...

After some time, the following error log appeared:

...
2021-12-23 06:10:25,003 INFO retry.RetryInvocationHandler: java.io.EOFException: End of File Exception between local host is: "master/172.17.0.2"; destination host is: "master2":8032; : java.io.EOFException; For more details see:  http://wiki.apache.org/hadoop/EOFException, while invoking ApplicationClientProtocolPBClientImpl.getApplicationReport over rm2. Trying to failover immediately.
2021-12-23 06:10:25,003 INFO client.ConfiguredRMFailoverProxyProvider: Failing over to rm1
2021-12-23 06:10:25,004 INFO retry.RetryInvocationHandler: java.net.ConnectException: Call From master/172.17.0.2 to master:8032 failed on connection exception: java.net.ConnectException: Connection refused; For more details see:  http://wiki.apache.org/hadoop/ConnectionRefused, while invoking ApplicationClientProtocolPBClientImpl.getApplicationReport over rm1 after 1 failover attempts. Trying to failover after sleeping for 18340ms.
2021-12-23 06:10:43,347 INFO client.ConfiguredRMFailoverProxyProvider: Failing over to rm2
...

My understanding is that Spark releases its resource manager allocation once the work is done, so it is normal for the above error log to appear.

  • Q1. Is the behavior above normal?
  • Q2. After the job finishes, where can I check the results? Can I check them in the "Logs for container" web UI?

IMPORTANT!! ADDED: I re-ran the command and checked the status: SUCCEEDED. Why does the spark-submit operation sometimes succeed and sometimes stop in the middle?
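
To make the timeline in the logs above easier to read, here is a minimal sketch that reduces the repeated `yarn.Client` report lines to the points where the state actually changes. The function names `state_transitions` and `yarn_logs_command` are hypothetical helpers written for this post, and the regular expression assumes the log format shown in the excerpts. The second helper only builds the argument list for `yarn logs -applicationId <id>`; whether that command returns the driver's stdout depends on assumptions I have not verified here (for example, that YARN log aggregation is enabled on the cluster).

```python
import re

# Matches yarn.Client report lines in the format shown in the excerpts, e.g.
# "2021-12-23 06:07:50,158 INFO yarn.Client: Application report for application_... (state: ACCEPTED)"
REPORT_RE = re.compile(
    r"^(?P<ts>\d{4}-\d{2}-\d{2} \d{2}:\d{2}:\d{2}),\d+ INFO yarn\.Client: "
    r"Application report for (?P<app_id>application_\d+_\d+) \(state: (?P<state>\w+)\)"
)


def state_transitions(lines):
    """Return (timestamp, app_id, state) for each report line where the state changes."""
    transitions = []
    last_state = {}  # last seen state per application ID
    for line in lines:
        match = REPORT_RE.match(line.strip())
        if match is None:
            continue  # skip non-report lines such as "..." or the detail block
        app_id, state = match["app_id"], match["state"]
        if last_state.get(app_id) != state:
            transitions.append((match["ts"], app_id, state))
            last_state[app_id] = state
    return transitions


def yarn_logs_command(app_id):
    """Build (but do not run) the argument list for fetching an application's logs."""
    # Hypothetical helper: assumes the `yarn` CLI is on PATH where this is executed.
    return ["yarn", "logs", "-applicationId", app_id]


# Sample lines copied from the log excerpts above.
sample = [
    "2021-12-23 06:07:50,158 INFO yarn.Client: Application report for application_1640239254568_0002 (state: ACCEPTED)",
    "2021-12-23 06:07:51,162 INFO yarn.Client: Application report for application_1640239254568_0002 (state: ACCEPTED)",
    "2021-12-23 06:09:06,694 INFO yarn.Client: Application report for application_1640239254568_0002 (state: RUNNING)",
    "2021-12-23 06:09:06,695 INFO yarn.Client:",
    "2021-12-23 06:09:07,707 INFO yarn.Client: Application report for application_1640239254568_0002 (state: RUNNING)",
]

for ts, app_id, state in state_transitions(sample):
    print(ts, app_id, state)
```

On the sample lines this leaves one entry per state, which shows that the application sat in ACCEPTED for a little over a minute before moving to RUNNING.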

SecY