
I upload Scala/Spark jars to HDFS to test them on our cluster. After running, I frequently realize there are changes that need to be made, so I make the changes locally and push the new jar back up to HDFS. However, often (but not always) when I do this, Hadoop throws an error essentially saying that this jar is not the same as the old jar (duh).

I've tried clearing my Trash, .staging, and .sparkstaging directories, but that doesn't do anything. I've tried renaming the jar, which works sometimes and other times doesn't (it's still ridiculous that I have to do this in the first place).

Does anyone know why this is occurring and how I can prevent it? Thanks for any help. Here are some logs in case they help (I've edited out some paths):

Application application_1475165877428_124781 failed 2 times due to AM Container for appattempt_1475165877428_124781_000002 exited with exitCode: -1000
For more detailed output, check application tracking page: http://examplelogsite/ Then, click on links to logs of each attempt.
Diagnostics: Resource MYJARPATH/EXAMPLE.jar changed on src filesystem (expected 1475433291946, was 1475433292850
java.io.IOException: Resource MYJARPATH/EXAMPLE.jar changed on src filesystem (expected 1475433291946, was 1475433292850
    at org.apache.hadoop.yarn.util.FSDownload.copy(FSDownload.java:253)
    at org.apache.hadoop.yarn.util.FSDownload.access$000(FSDownload.java:61)
    at org.apache.hadoop.yarn.util.FSDownload$2.run(FSDownload.java:359)
    at org.apache.hadoop.yarn.util.FSDownload$2.run(FSDownload.java:357)
    at java.security.AccessController.doPrivileged(Native Method)
    at javax.security.auth.Subject.doAs(Subject.java:422)
    at org.apache.hadoop.security.UserGroupInformation.doAs(UserGroupInformation.java:1693)
    at org.apache.hadoop.yarn.util.FSDownload.call(FSDownload.java:356)
    at org.apache.hadoop.yarn.util.FSDownload.call(FSDownload.java:60)
    at java.util.concurrent.FutureTask.run(FutureTask.java:266)
    at java.util.concurrent.Executors$RunnableAdapter.call(Executors.java:511)
    at java.util.concurrent.FutureTask.run(FutureTask.java:266)
    at java.util.concurrent.ThreadPoolExecutor.runWorker(ThreadPoolExecutor.java:1142)
    at java.util.concurrent.ThreadPoolExecutor$Worker.run(ThreadPoolExecutor.java:617)
    at java.lang.Thread.run(Thread.java:745)
Failing this attempt. Failing the application.
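For what it's worth, the two numbers in the diagnostics look like epoch-millisecond modification times of the jar: decoding them (a quick sketch, assuming they are millisecond timestamps) shows the jar's modification time moved by under a second between what YARN recorded at submission and what it found during localization:

```java
import java.time.Instant;

public class TimestampDiff {
    public static void main(String[] args) {
        // The two values from the "changed on src filesystem" diagnostics,
        // interpreted as epoch milliseconds.
        long expected = 1475433291946L; // timestamp YARN recorded at submission
        long actual   = 1475433292850L; // timestamp found during localization

        System.out.println("expected: " + Instant.ofEpochMilli(expected));
        System.out.println("actual:   " + Instant.ofEpochMilli(actual));
        System.out.println("diff ms:  " + (actual - expected)); // 904 ms apart
    }
}
```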

gsamaras
David Schuler

2 Answers


I haven't seen that exit code before, so it doesn't tell me anything. I would suggest you check the logs, like this:

yarn logs -applicationId <your_application_ID>
gsamaras
  • This is the weird thing: I'm running this via an Oozie workflow, and neither the Oozie job nor the Spark job has any logs in the typical place. I'm just getting the above log through Hue – David Schuler Oct 02 '16 at 18:56

According to your log, I'm sure the error comes from the YARN side.
As a workaround, you can modify YARN yourself to skip this exception.
I ran into this thread because of the same "changed on src filesystem" error; I hit the issue and skipped it by modifying the YARN source code.
For more details, you can refer to how-to-fix-resource-changed-on-src-filesystem-issue
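As a rough sketch of what this means (simplified, not the actual Hadoop source): during localization YARN compares the resource's current modification time against the one recorded at submission and throws if they differ; the workaround relaxes that check to a warning. Both method names below are illustrative, not Hadoop APIs:

```java
import java.io.IOException;

public class TimestampCheck {
    // Sketch of the kind of check FSDownload performs during localization.
    // YARN records the jar's modification time when the job is submitted
    // and refuses to localize it if the time has changed since.
    static void verifyTimestamp(String path, long expected, long actual)
            throws IOException {
        if (actual != expected) {
            throw new IOException("Resource " + path
                + " changed on src filesystem (expected " + expected
                + ", was " + actual + ")");
        }
    }

    // The workaround described above: demote the failure to a warning
    // so localization continues with the newer jar.
    static void verifyTimestampLenient(String path, long expected, long actual) {
        if (actual != expected) {
            System.err.println("WARN: resource " + path
                + " changed on src filesystem (expected " + expected
                + ", was " + actual + "); continuing anyway");
        }
    }
}
```

Note that patching YARN this way trades safety for convenience: the check exists so every container localizes the exact bytes the client submitted.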

Eugene