Hi, I am trying to parse an XML file using the MapReduce framework, with the JDOM parser doing the XML parsing. When I run my MapReduce code on a pseudo-distributed cluster, it gives me the following error:

WARN mapred.JobClient: Use GenericOptionsParser for parsing the arguments. Applications should implement Tool for the same.
INFO input.FileInputFormat: Total input paths to process : 1
INFO util.NativeCodeLoader: Loaded the native-hadoop library
WARN snappy.LoadSnappy: Snappy native library not loaded
INFO mapred.JobClient: Running job: job_201303281220_0016
INFO mapred.JobClient: map 0% reduce 0%
INFO mapred.JobClient: Task Id : attempt_201303281220_0016_m_000000_0, Status : FAILED
Error: org/jdom/JDOMException
INFO mapred.JobClient: Task Id : attempt_201303281220_0016_m_000000_1, Status : FAILED
Error: org/jdom/JDOMException
INFO mapred.JobClient: Task Id : attempt_201303281220_0016_m_000000_2, Status : FAILED
Error: org/jdom/JDOMException
INFO mapred.JobClient: Job complete: job_201303281220_0016
INFO mapred.JobClient: Counters: 7
INFO mapred.JobClient: Job Counters
INFO mapred.JobClient: SLOTS_MILLIS_MAPS=7541
INFO mapred.JobClient: Total time spent by all reduces waiting after reserving slots (ms)=0
INFO mapred.JobClient: Total time spent by all maps waiting after reserving slots (ms)=0
INFO mapred.JobClient: Launched map tasks=4
INFO mapred.JobClient: Data-local map tasks=4
INFO mapred.JobClient: SLOTS_MILLIS_REDUCES=0
INFO mapred.JobClient: Failed map tasks=1

I tried downloading the JDOM 1.x jars, but that did not help; I still get the same error. Any suggestions would be a great help.

NOTE: I am able to run various examples such as word count and Pi, so I think my cluster is set up properly.

Thanks in advance.

user1188611
  • Surely the JDOMException is logged somewhere! What does it say the problem is? – rolfl Apr 02 '13 at 12:05
  • @rolfl: I checked the logs for the task tracker and job tracker yesterday. I found the error and the issue is resolved. Thanks for replying, I appreciate it. – user1188611 Apr 02 '13 at 13:41

1 Answer

You need to confirm that your input file has one XML document per line (i.e. no line feeds inside a document). It's probable that the map() method is being handed single lines (you're using FileInputFormat), so with embedded line feeds those lines contain only partial XML documents.

For example, if your file looks like this:

<root
    arg1=""
    arg2="">

</root>

Then the map() method will be called once for each of the five lines. None of the lines contains a valid XML document, so a DOM parsing error would be thrown five times, even though your file really does contain valid XML.
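The effect described above can be reproduced outside Hadoop with a small sketch. It is only an illustration under stated assumptions: the class name LineSplitDemo and its helper methods are hypothetical, the JDK's built-in DOM parser stands in for JDOM, and a plain split on line feeds stands in for a line-oriented input format handing one line at a time to map().

```java
import java.io.IOException;
import java.io.StringReader;
import javax.xml.parsers.DocumentBuilder;
import javax.xml.parsers.DocumentBuilderFactory;
import javax.xml.parsers.ParserConfigurationException;
import org.xml.sax.InputSource;
import org.xml.sax.SAXException;
import org.xml.sax.helpers.DefaultHandler;

// Hypothetical demo class: not part of Hadoop or JDOM.
class LineSplitDemo {
    // The five-line sample file from the answer (the fourth line is blank).
    static final String SAMPLE =
        "<root\n    arg1=\"\"\n    arg2=\"\">\n\n</root>\n";

    // Returns true if the text is a well-formed XML document.
    // Uses the JDK DOM parser here as a stand-in for JDOM.
    static boolean parses(String text) {
        try {
            DocumentBuilder builder =
                DocumentBuilderFactory.newInstance().newDocumentBuilder();
            // DefaultHandler rethrows fatal errors without printing to stderr.
            builder.setErrorHandler(new DefaultHandler());
            builder.parse(new InputSource(new StringReader(text)));
            return true;
        } catch (SAXException | IOException | ParserConfigurationException e) {
            return false;
        }
    }

    // Simulates a mapper that receives one line per call and tries to
    // parse each line as a complete document; counts the failures.
    static int countFailedLines(String fileContents) {
        int failures = 0;
        for (String line : fileContents.split("\n")) {
            if (!parses(line)) {
                failures++;
            }
        }
        return failures;
    }

    public static void main(String[] args) {
        System.out.println("failed lines: " + countFailedLines(SAMPLE));
        System.out.println("whole file parses: " + parses(SAMPLE));
    }
}
```

With the sample above, every line fails on its own while the file as a whole parses, which is the situation to rule out before looking elsewhere.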

Chris Gerken
  • I confirmed that the input file has one document per line. Can you suggest something else? Thanks for replying. – user1188611 Apr 01 '13 at 20:11
  • To make sure we are on the same page, what exactly do you mean by one XML document per line? Just to be clear that we are both talking about the same thing. – user1188611 Apr 01 '13 at 20:12