How can I get the text from an OpenOffice document?
I use OpenOffice (OO) to convert MS Word and Excel files to PDF so that they can be displayed on a web page. Sometimes an input file is corrupted, and a corrupted file opens as one very large block of XML.
To handle this, I plan to read the first line of the document's content and, if it contains an XML tag, suggest that the user download the document and try to repair it or open it in MS Word. However, I have not found any detailed documentation or examples showing how to work with a document's text.
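
To make the check concrete, here is a rough sketch of the "first line looks like XML" test in Python. It assumes the document text is already available as a plain string. For a Writer document opened over UNO, `document.getText().getString()` should give that string, but that step is not shown here and may behave differently for spreadsheets. The names `first_line` and `looks_like_raw_xml` are made up for this example, and the regular expression is only a heuristic, not a real XML parser.

```python
import re

# Heuristic: the line starts with an XML declaration ("<?xml")
# or with an opening tag such as "<w:document ...>".
_XML_START = re.compile(r"^\s*<(\?xml\b|[A-Za-z_][\w:.-]*[\s>/])")


def first_line(text):
    """Return the first non-blank line of the text, or '' if there is none."""
    for line in text.splitlines():
        if line.strip():
            return line
    return ""


def looks_like_raw_xml(text):
    """Guess whether the document content is raw XML rather than normal text."""
    return bool(_XML_START.match(first_line(text)))


if __name__ == "__main__":
    # Hypothetical sample strings standing in for extracted document text.
    corrupted = '<?xml version="1.0" encoding="UTF-8"?>\n<w:document>...'
    normal = "Quarterly report\nSales went up."
    for sample in (corrupted, normal):
        if looks_like_raw_xml(sample):
            print("Looks corrupted: offer the original file for download.")
        else:
            print("Looks fine: continue with the PDF conversion.")
```

Only the first non-blank line is inspected, so the check stays cheap even when the corrupted file expands to a very large XML dump. A legitimate document whose first line happens to start with a tag would be flagged as well, so it is probably safer to treat the result as a hint and still let the user download the file.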