
I am getting this ClassCastException when I use MultipleInputs in my MR job.

Error: java.lang.ClassCastException: org.apache.hadoop.mapreduce.lib.input.TaggedInputSplit cannot be cast to org.apache.hadoop.mapreduce.lib.input.FileSplit
    at com.capitalone.integratekeys.mapreduce.mapper.IntegrationKeysMapperInput.setup(IntegrationKeysMapperInput.java:74)
    at org.apache.hadoop.mapreduce.Mapper.run(Mapper.java:142)
    at org.apache.hadoop.mapreduce.lib.input.DelegatingMapper.run(DelegatingMapper.java:55)
    at org.apache.hadoop.mapred.MapTask.runNewMapper(MapTask.java:784)
    at org.apache.hadoop.mapred.MapTask.run(MapTask.java:341)
    at org.apache.hadoop.mapred.YarnChild$2.run(YarnChild.java:168)
    at java.security.AccessController.doPrivileged(Native Method)
    at javax.security.auth.Subject.doAs(Subject.java:415)
    at org.apache.hadoop.security.UserGroupInformation.doAs(UserGroupInformation.java:1614)
    at org.apache.hadoop.mapred.YarnChild.main(YarnChild.java:163)

inputSource = ((FileSplit)context.getInputSplit()).getPath().toString();

I found a ticket in JIRA saying this is resolved, but I am still facing the issue. Please give me some inputs.

Aviral Kumar
    Appears to be a duplicate of: http://stackoverflow.com/questions/11130145/hadoop-multipleinputs-fails-with-classcastexception – Jeremy Beard Feb 27 '15 at 17:44
  • I have already seen that question. It was asked a long time ago. This is actually a bug, and it was claimed to be fixed in Hadoop 0.20.1, but I am still facing this issue. So I wanted to know whether someone else is facing it too. – Aviral Kumar Feb 28 '15 at 06:16
  • Which JIRA? I see https://issues.apache.org/jira/browse/MAPREDUCE-2226 is unresolved. Have you tried the accepted answer in the question linked above? – Jeremy Beard Mar 01 '15 at 15:19
  • hope this helps... https://issues.apache.org/jira/browse/MAPREDUCE-1178 – Aviral Kumar Mar 01 '15 at 18:13
  • I have tried that. But I don't want to use reflection, as it is not good for performance. – Aviral Kumar Mar 02 '15 at 06:49
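For reference, the reflection workaround discussed in the comments looks roughly like this. It is a sketch, not an official API: `TaggedInputSplit` and its `getInputSplit()` method are package-private in Hadoop's `mapreduce.lib.input` package, which is why the wrapped split has to be unwrapped via reflection instead of a direct cast.

```java
import java.lang.reflect.Method;
import org.apache.hadoop.mapreduce.InputSplit;
import org.apache.hadoop.mapreduce.lib.input.FileSplit;

/**
 * Unwraps the FileSplit that MultipleInputs hides inside a TaggedInputSplit.
 * Falls through to a plain cast when the job does not use MultipleInputs.
 */
public static FileSplit unwrapFileSplit(InputSplit split) throws Exception {
    if (split.getClass().getSimpleName().equals("TaggedInputSplit")) {
        // getInputSplit() is package-private, so invoke it reflectively.
        Method getInputSplit = split.getClass().getDeclaredMethod("getInputSplit");
        getInputSplit.setAccessible(true);
        split = (InputSplit) getInputSplit.invoke(split);
    }
    return (FileSplit) split;
}
```

In the mapper's `setup()` this would be used as `inputSource = unwrapFileSplit(context.getInputSplit()).getPath().toString();`. The reflective call happens once per task in `setup()`, not per record, so its overhead is negligible in practice.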

1 Answer


In this line:

inputSource = ((FileSplit)context.getInputSplit()).getPath().toString();

context.getInputSplit() returns an instance of TaggedInputSplit, and you are casting it to FileSplit.

I checked both classes: they have no parent-child relationship, so the cast throws a ClassCastException. You can instead read the input source from the Hadoop configuration.

One way to get the input source, given the JobContext context argument:

inputSource = context.getConfiguration().get("mapreduce.input.fileinputformat.inputdir", null);

If you are not able to get the input source this way, please show how you set the input file paths in your driver program.

Tinku