
There is some information on the web indicating that Mahout's XmlInputFormat can be used to process XML efficiently on Hadoop, but I've been unable to find an example of how to get it working. Can someone point me in the right direction?

I'm using Cascalog/Clojure.

Kevin

1 Answer


Have a look at this post, which shows how to read an XML file using a Hadoop RecordReader implementation:

http://javatute.com/javatute/faces/post/hadoop/2014/reading-simple-xml-file-using-hadoop.xhtml
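For a stand-alone idea of what such a record reader does: to my understanding, Mahout's XmlInputFormat is given a start tag and an end tag (commonly set through the `xmlinput.start` and `xmlinput.end` configuration keys) and emits everything between each pair as one record. The sketch below illustrates that scanning idea using only the JDK, so it is not the Hadoop or Mahout API. The class and method names (`XmlRecordSketch`, `readUntilMatch`, `splitRecords`) are hypothetical, and the tag matching is deliberately simple.

```java
import java.io.ByteArrayInputStream;
import java.io.ByteArrayOutputStream;
import java.io.IOException;
import java.io.InputStream;
import java.nio.charset.StandardCharsets;
import java.util.ArrayList;
import java.util.List;

// Hypothetical sketch of tag-delimited record splitting; not Mahout's code.
class XmlRecordSketch {

    // Reads bytes until `match` has been seen in full. When `out` is non-null,
    // every byte read is copied into it. Returns false if the stream ends first.
    // The matching is naive (no full backtracking), which is adequate for tags
    // such as "<rec>" whose first byte does not repeat inside the tag.
    static boolean readUntilMatch(InputStream in, byte[] match, ByteArrayOutputStream out)
            throws IOException {
        int i = 0;
        while (true) {
            int b = in.read();
            if (b == -1) {
                return false;
            }
            if (out != null) {
                out.write(b);
            }
            if (b == match[i]) {
                i++;
                if (i >= match.length) {
                    return true;
                }
            } else {
                // Restart the match, allowing this byte to begin a new attempt.
                i = (b == match[0]) ? 1 : 0;
            }
        }
    }

    // Returns every block that starts with startTag and ends with endTag.
    // A record that is opened but never closed is dropped.
    static List<String> splitRecords(InputStream in, String startTag, String endTag)
            throws IOException {
        byte[] start = startTag.getBytes(StandardCharsets.UTF_8);
        byte[] end = endTag.getBytes(StandardCharsets.UTF_8);
        List<String> records = new ArrayList<>();
        ByteArrayOutputStream buf = new ByteArrayOutputStream();
        while (readUntilMatch(in, start, null)) {
            buf.reset();
            buf.write(start, 0, start.length);
            if (!readUntilMatch(in, end, buf)) {
                break;
            }
            records.add(new String(buf.toByteArray(), StandardCharsets.UTF_8));
        }
        return records;
    }

    public static void main(String[] args) throws IOException {
        String xml = "<doc><rec>1</rec>junk<rec>2</rec></doc>";
        InputStream in = new ByteArrayInputStream(xml.getBytes(StandardCharsets.UTF_8));
        for (String record : splitRecords(in, "<rec>", "</rec>")) {
            System.out.println(record);
        }
    }
}
```

In a real job the same loop would sit inside a RecordReader working on one input split, with each record handed to the mapper as a value. From Cascalog the input format would still need to be wrapped in a tap, which this sketch does not cover.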

Ashish
  • Note that [link-only answers are discouraged](http://meta.stackoverflow.com/tags/link-only-answers/info), SO answers should be the end-point of a search for a solution (vs. yet another stopover of references, which tend to get stale over time). Please consider adding a stand-alone synopsis here, keeping the link as a reference. – kleopatra Mar 11 '14 at 09:27