Is this an error that occurred after the Spark operation?

Question

I ran the following command:

$ spark-submit --master yarn --deploy-mode cluster pi.py

After that, the following log line was printed continuously:

...
2021-12-23 06:07:50,158 INFO yarn.Client: Application report for application_1640239254568_0002 (state: ACCEPTED)
2021-12-23 06:07:51,162 INFO yarn.Client: Application report for application_1640239254568_0002 (state: ACCEPTED)
...

I checked the result through the web UI on port 8088 (the "Logs for container" page), but there was nothing in stdout.

I was disappointed and tried to force the Spark operation to end, but suddenly new log output was printed, as shown below:

...
2021-12-23 06:09:06,694 INFO yarn.Client: Application report for application_1640239254568_0002 (state: RUNNING)
2021-12-23 06:09:06,695 INFO yarn.Client:
         client token: N/A
         diagnostics: N/A
         ApplicationMaster host: master
         ApplicationMaster RPC port: 40451
         queue: default
         start time: 1640239668020
         final status: UNDEFINED
         tracking URL: http://master2:8088/proxy/application_1640239254568_0002/
         user: root
2021-12-23 06:09:07,707 INFO yarn.Client: Application report for application_1640239254568_0002 (state: RUNNING)
...
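
To see exactly when the application moved from ACCEPTED to RUNNING, the client output can be reduced to its state changes. This is only a minimal sketch, assuming the `yarn.Client` report lines always have the format shown in the excerpts above; `REPORT_RE` and `extract_transitions` are hypothetical names introduced for illustration, not part of Spark or YARN.

```python
import re

# Assumed format, copied from the log excerpts in this question:
# 2021-12-23 06:07:50,158 INFO yarn.Client: Application report for application_..._0002 (state: ACCEPTED)
REPORT_RE = re.compile(
    r"^(?P<ts>\d{4}-\d{2}-\d{2} \d{2}:\d{2}:\d{2}),\d{3} INFO yarn\.Client: "
    r"Application report for (?P<app>application_\d+_\d+) \(state: (?P<state>[A-Z_]+)\)"
)


def extract_transitions(lines):
    """Return (timestamp, app_id, state) for each change in the reported state."""
    transitions = []
    last_state = {}  # most recent state seen per application id
    for line in lines:
        match = REPORT_RE.match(line.strip())
        if match is None:
            continue  # not an application report line
        app, state = match["app"], match["state"]
        if last_state.get(app) != state:
            transitions.append((match["ts"], app, state))
            last_state[app] = state
    return transitions


# Lines taken from the log output quoted in this question.
sample = [
    "2021-12-23 06:07:50,158 INFO yarn.Client: Application report for application_1640239254568_0002 (state: ACCEPTED)",
    "2021-12-23 06:07:51,162 INFO yarn.Client: Application report for application_1640239254568_0002 (state: ACCEPTED)",
    "2021-12-23 06:09:06,694 INFO yarn.Client: Application report for application_1640239254568_0002 (state: RUNNING)",
    "2021-12-23 06:09:06,695 INFO yarn.Client:",
    "2021-12-23 06:09:07,707 INFO yarn.Client: Application report for application_1640239254568_0002 (state: RUNNING)",
]

for ts, app, state in extract_transitions(sample):
    print(ts, app, state)
```

On the lines quoted here, this shows the application sitting in ACCEPTED from 06:07:50 until the first RUNNING report at 06:09:06.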

After some time, the following errors were logged:

...
2021-12-23 06:10:25,003 INFO retry.RetryInvocationHandler: java.io.EOFException: End of File Exception between local host is: "master/172.17.0.2"; destination host is: "master2":8032; : java.io.EOFException; For more details see:  http://wiki.apache.org/hadoop/EOFException, while invoking ApplicationClientProtocolPBClientImpl.getApplicationReport over rm2. Trying to failover immediately.
2021-12-23 06:10:25,003 INFO client.ConfiguredRMFailoverProxyProvider: Failing over to rm1
2021-12-23 06:10:25,004 INFO retry.RetryInvocationHandler: java.net.ConnectException: Call From master/172.17.0.2 to master:8032 failed on connection exception: java.net.ConnectException: Connection refused; For more details see:  http://wiki.apache.org/hadoop/ConnectionRefused, while invoking ApplicationClientProtocolPBClientImpl.getApplicationReport over rm1 after 1 failover attempts. Trying to failover after sleeping for 18340ms.
2021-12-23 06:10:43,347 INFO client.ConfiguredRMFailoverProxyProvider: Failing over to rm2
...

My understanding is that Spark is finished with the ResourceManager once its work is done, so it would be normal for the errors above to appear.

Q1. Is the behavior above normal?
Q2. After the job finishes, where can I check the results? Can I check them in the container logs web UI?

IMPORTANT, added: I re-ran the command and checked the status: SUCCEEDED. Why does the spark-submit operation sometimes succeed and sometimes stop in the middle?
