
I have a number of sequenced Hadoop jobs in which I need a DistributedCache file.

The driver class (Controller) receives its input from the previous job, modifies a file, places it in the DistributedCache, and starts a new job.
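For context, the driver loop looks roughly like the sketch below. It uses the old DistributedCache API from Hadoop 1.x; the local and HDFS paths, the number of chained jobs, and the Controller body are placeholders rather than the actual code.

    import java.net.URI;

    import org.apache.hadoop.conf.Configuration;
    import org.apache.hadoop.filecache.DistributedCache;
    import org.apache.hadoop.fs.FileSystem;
    import org.apache.hadoop.fs.Path;
    import org.apache.hadoop.mapreduce.Job;

    public class Controller {
        public static void main(String[] args) throws Exception {
            Configuration conf = new Configuration();
            FileSystem fs = FileSystem.get(conf);
            int numJobs = 3; // placeholder for the real number of chained jobs

            for (int i = 0; i < numJobs; i++) {
                // 1. Modify the model file locally, then overwrite the copy on HDFS.
                Path hdfsModel = new Path("/xx/x/modelfile2");
                fs.copyFromLocalFile(false, true, new Path("/tmp/modelfile2"), hdfsModel);

                // 2. Register the HDFS file in the DistributedCache with a #symlink,
                //    as in the error message (modelfile2#modelfile2).
                Job job = new Job(conf, "chained-job-" + i);
                DistributedCache.addCacheFile(
                        new URI(hdfsModel.toString() + "#modelfile2"),
                        job.getConfiguration());
                DistributedCache.createSymlink(job.getConfiguration());

                // ... set mapper, reducer, input and output paths here ...

                // 3. Run the job; its output becomes the input of the next iteration.
                job.waitForCompletion(true);
            }
        }
    }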

After the first job (i.e. in the second job), I get this error:

java.io.IOException: 
The distributed cache object hdfs://xxxx/xx/x/modelfile2#modelfile2 
changed during the job from 11/8/12 11:55 PM to 11/8/12 11:55 PM

Does anyone know what the problem might be?

Razvan
  • The job still seems to complete successfully! Is this a Hadoop bug? Does it have anything to do with the space available on HDFS? Did any of you have the same problem? – Razvan Nov 09 '12 at 18:07

1 Answer


According to the sources in TrackerDistributedCacheManager.java (method downloadCacheObject), when this exception is thrown it is not ignored, and the actual download of the file from HDFS to the local file system does not happen, so the task will not find its file in the distributed cache. I would suspect that you may be registering the same object twice, or there might be a bug in Hadoop when several jobs started from the same controller put a file with the same name in the distributed cache.

David Gruzman
  • I repeatedly do this: I modify the file locally, copy the new version to HDFS, and register the same file again (but for a new job). Are you saying I should not register it again? I think I should, because as far as I know there is no guarantee that the file is still in the DistributedCache after the job it was registered for has ended. – Razvan Nov 10 '12 at 22:08
  • For sure you should register it. But it does sound like a bug in Hadoop. I would suggest the following way to test it: use a different file name each time (File1, File2, ...) and pass the name of that file to the job via some config parameter. If it works that way, it is clearly a bug in Hadoop. – David Gruzman Nov 10 '12 at 22:50
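A minimal sketch of the test David suggests, assuming the old DistributedCache API; the configuration property name (model.file.name) and the HDFS path are made up for illustration:

    import java.net.URI;

    import org.apache.hadoop.conf.Configuration;
    import org.apache.hadoop.filecache.DistributedCache;
    import org.apache.hadoop.mapreduce.Job;

    public class UniqueCacheNameTest {
        // Build job number 'iteration' with its own uniquely named cache file.
        static Job createJob(Configuration conf, int iteration) throws Exception {
            String fileName = "modelfile" + iteration; // modelfile1, modelfile2, ...
            Job job = new Job(conf, "iteration-" + iteration);

            // Hypothetical property telling the tasks which cache file to open.
            job.getConfiguration().set("model.file.name", fileName);

            DistributedCache.addCacheFile(
                    new URI("/xx/x/" + fileName + "#" + fileName),
                    job.getConfiguration());
            DistributedCache.createSymlink(job.getConfiguration());
            return job;
        }
    }

    // In the mapper/reducer setup(), read the name back and open the symlink:
    //   String name = context.getConfiguration().get("model.file.name");
    //   BufferedReader in = new BufferedReader(new FileReader(name));

If each job succeeds with its own file name but fails when the name is reused, that points to the reuse of the cache entry (and thus the suspected Hadoop bug) rather than to your controller code.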