
I am working on a project that requires extracting data from all sorts of documents. To do this, I need to convert each document to XML so that I can later parse it with a SAX parser. How do I get the XML equivalent of an MS Office document?

Nidhi jain
  • If the document is saved in one of the newer formats (e.g. `.docx` for Word), then it is already zipped XML. – Richard Aug 10 '15 at 09:12
  • Which file formats are you working with? Something like `.doc` is quite different to something like `.xlsx`. Also, what kind of XML? HTML? Other? – Gagravarr Aug 10 '15 at 11:39
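
To illustrate the first comment: a `.docx` file is a ZIP archive whose main body lives in the part `word/document.xml`, so it can be streamed straight into a SAX parser without a separate conversion step. Below is a minimal sketch, assuming Python and its standard `zipfile` and `xml.sax` modules (the question does not name a language, so this choice is only for illustration). The names `DocxTextHandler` and `read_docx_paragraphs` are hypothetical. It does not cover the legacy binary `.doc` format, and other Office formats such as `.xlsx` keep their content in different parts of the archive.

```python
import zipfile
import xml.sax
from xml.sax.handler import ContentHandler, feature_namespaces

# WordprocessingML main namespace used by the elements in word/document.xml.
W_NS = "http://schemas.openxmlformats.org/wordprocessingml/2006/main"


class DocxTextHandler(ContentHandler):
    """Hypothetical handler: collects the text of each paragraph (w:p)."""

    def __init__(self):
        super().__init__()
        self.paragraphs = []
        self._current = []
        self._in_text = False

    def startElementNS(self, name, qname, attrs):
        # name is a (namespace URI, local name) tuple in namespace mode.
        if name == (W_NS, "t"):
            self._in_text = True

    def characters(self, content):
        # Only keep character data that sits inside a w:t text run.
        if self._in_text:
            self._current.append(content)

    def endElementNS(self, name, qname):
        if name == (W_NS, "t"):
            self._in_text = False
        elif name == (W_NS, "p"):
            # A paragraph ended: join its runs into one string.
            self.paragraphs.append("".join(self._current))
            self._current = []


def read_docx_paragraphs(path):
    """Stream word/document.xml out of the archive into a SAX parser."""
    parser = xml.sax.make_parser()
    parser.setFeature(feature_namespaces, True)
    handler = DocxTextHandler()
    parser.setContentHandler(handler)
    with zipfile.ZipFile(path) as archive:
        with archive.open("word/document.xml") as xml_stream:
            parser.parse(xml_stream)
    return handler.paragraphs
```

Calling `read_docx_paragraphs("report.docx")` (the file name is a placeholder) would return a list with one string per paragraph. The same idea should carry over to any SAX implementation: open the archive, locate the XML part, and hand the stream to the parser.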

0 Answers