I'm using TextLine in Cascading to load files with very long lines: around 30 MB on average, some much longer. When I run the job locally to test it, it runs fine, but when I run it on the cluster it fails after a period of intensive crunching. It gives errors like:

cascading.tuple.TupleException: unable to read from input identifier: maprfs:/xxx/xxx/xxx/part-00001
at cascading.tuple.TupleEntrySchemeIterator.hasNext(TupleEntrySchemeIterator.java:127)
at cascading.flow.stream.SourceStage.map(SourceStage.java:76)
at cascading.flow.stream.SourceStage.run(SourceStage.java:58)
at cascading.flow.hadoop.FlowMapper.run(FlowMapper.java:127)
at org.apache.hadoop.mapred.MapTask.runOldMapper(MapTask.java:443)
at org.apache.hadoop.mapred.MapTask.run(MapTask.java:353)
at org.apache.hadoop.mapred.Child$4.run(Child.java:282)
at java.security.AccessController.doPrivileged(Native Method)
at javax.security.auth.Subject.doAs(Subject.java:422)
at org.apache.hadoop.security.UserGroupInformation.doAs(UserGroupInformation.java:1122)
at org.apache.hadoop.mapred.Child.main(Child.java:271)

It also sometimes complains about stale file handles. The file it's trying to read is definitely there. Can somebody help me, please?
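
To confirm how long the lines in a local copy of one of the part files really are, a small stand-alone check along these lines could be used. It is only a minimal sketch built on the JDK, not Cascading or MapR code: the class name `LineLengthStats` and its `scan` method are hypothetical, and it assumes lines end with `\n`. It reads the stream in fixed-size chunks, so a single 30 MB line never has to be held in memory.

```java
import java.io.BufferedInputStream;
import java.io.FileInputStream;
import java.io.IOException;
import java.io.InputStream;

// Hypothetical helper: streams a file and reports line-length statistics.
class LineLengthStats {

    // Returns {line count, longest line in bytes, total bytes excluding '\n'}.
    // A trailing '\r' (Windows line endings) is counted as part of the line.
    static long[] scan(InputStream in) throws IOException {
        long lines = 0;
        long longest = 0;
        long total = 0;
        long current = 0;
        byte[] buf = new byte[1 << 16]; // 64 KiB chunks, independent of line length
        int n;
        while ((n = in.read(buf)) != -1) {
            for (int i = 0; i < n; i++) {
                if (buf[i] == '\n') {
                    lines++;
                    longest = Math.max(longest, current);
                    total += current;
                    current = 0;
                } else {
                    current++;
                }
            }
        }
        if (current > 0) { // last line without a trailing newline
            lines++;
            longest = Math.max(longest, current);
            total += current;
        }
        return new long[] {lines, longest, total};
    }

    public static void main(String[] args) throws IOException {
        // args[0] is the path to a local copy of the input file
        try (InputStream in = new BufferedInputStream(new FileInputStream(args[0]))) {
            long[] s = scan(in);
            long average = s[0] == 0 ? 0 : s[2] / s[0];
            System.out.printf("lines=%d longest=%d bytes average=%d bytes%n",
                    s[0], s[1], average);
        }
    }
}
```

Run it as `java LineLengthStats part-00001` against a local copy. It only measures the input and does not touch the cluster or the failing read path.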

Savage Reader
  • Are you sure that is the complete stacktrace? Is it related to https://groups.google.com/forum/#!topic/cascading-user/TlKjFdnOa84 ? – Alfonso Nishikawa Aug 14 '14 at 19:16
  • Here is a stacktrace from one of the map jobs: http://pastebin.com/9JCbsmcr . I don't see how your link is related to my problem. My problem is with reading very long lines from a text file using TextLine, I don't use sequence files. – Savage Reader Aug 15 '14 at 10:52
  • You are right, it's not about sequence files. Anyway, the stacktrace gives some information: it seems to be related to MapR :( I found the offending line to be `if (this.curPos_ + length > this.inode_.eof()) {` but who knows (if I am right) why `inode_` is null :( – Alfonso Nishikawa Aug 16 '14 at 16:19
  • I've opened a case with MapR support, I hope it gets solved soon. – Savage Reader Aug 18 '14 at 13:10
  • Thanks for the report :) I'm curious about it. – Alfonso Nishikawa Aug 19 '14 at 09:46

0 Answers