What is the best tool for extracting text and inline formatting (bold, italic, and so on) from a Word 2010 .docx file, if the goal is to transform the Office Open XML into a simpler XML format?
One idea that comes to mind is to convert the .docx to another format first. If so, which format would you suggest, and which program (preferably open source) would you use for the conversion?
Any other ideas (that is, different approaches)? Many tools still seem to be built for MS Office 2007. Are XPath, XQuery, and XSLT the way to go, and if so, why?
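For illustration, the kind of simplification described above might look like the following sketch. It uses only Python's standard library and relies on the fact that a .docx file is a ZIP archive whose main text lives in `word/document.xml`. The function name `simplify_docx` and the output elements `<doc>`, `<p>`, `<b>`, and `<i>` are hypothetical choices made for this example, not part of any standard. It only looks at bold and italic set directly on a run, so formatting inherited from styles, tables, footnotes, and the like would need extra handling.

```python
import zipfile
import xml.etree.ElementTree as ET

# WordprocessingML namespace used by word/document.xml
W = "http://schemas.openxmlformats.org/wordprocessingml/2006/main"
NS = {"w": W}


def _is_on(rpr, tag):
    """Return True if the run properties switch on a toggle such as w:b or w:i."""
    if rpr is None:
        return False
    el = rpr.find(f"w:{tag}", NS)
    if el is None:
        return False
    # A missing w:val attribute means the property is on.
    return el.get(f"{{{W}}}val") not in ("0", "false", "off")


def simplify_docx(source):
    """Read a .docx (path or file-like object) and return simplified XML.

    The <doc>/<p>/<b>/<i> vocabulary is a hypothetical target format.
    """
    with zipfile.ZipFile(source) as zf:
        root = ET.fromstring(zf.read("word/document.xml"))

    doc = ET.Element("doc")
    for para in root.iter(f"{{{W}}}p"):
        p = ET.SubElement(doc, "p")
        for run in para.iter(f"{{{W}}}r"):
            text = "".join(t.text or "" for t in run.iter(f"{{{W}}}t"))
            if not text:
                continue
            rpr = run.find("w:rPr", NS)
            parent = p
            if _is_on(rpr, "b"):
                parent = ET.SubElement(parent, "b")
            if _is_on(rpr, "i"):
                parent = ET.SubElement(parent, "i")
            if parent is not p:
                # Formatted run: the text goes inside the innermost new element.
                parent.text = text
            elif len(p):
                # Plain run after a formatted one: append to that element's tail.
                p[-1].tail = (p[-1].tail or "") + text
            else:
                # Plain run at the start of the paragraph.
                p.text = (p.text or "") + text
    return ET.tostring(doc, encoding="unicode")
```

Calling `simplify_docx("input.docx")` (the file name is a placeholder) returns the simplified markup as a string. The same mapping could also be written as an XSLT stylesheet applied to `word/document.xml`; the Python version is just one way to show the idea.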
Please be patient: I'm a beginner at this, and I would gladly welcome pointers to sources of knowledge, preferably concise ones.
xlixol