
I know that the standard defines two representations of an ODT file: one is an archive of several files (meta.xml, content.xml, etc.); the other is a single large XML file containing all the data. (I learned this from http://en.wikipedia.org/wiki/OpenDocument_technical_specification#Document_Representation)

The latter representation is easier to process, but unfortunately OpenOffice does not produce it.

My question: do you know of any filter, converter, or other tool that would help me transform an ODT file from the archive form into a single XML file? A Java class would be ideal.

Francis Upton IV
WojtusJ

2 Answers


Both OpenOffice and LibreOffice can produce ODT files in the "one big XML" format. These are called "Flat ODT" files.

Open an ODT file and choose "Save As…". In the dialog, change the file format to "Flat ODT".
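If the conversion has to be scripted rather than done through the dialog, the same export can probably be driven from Java by launching the office suite in headless mode. The following is only a sketch under assumptions: it assumes a LibreOffice installation whose `soffice` executable is on the PATH and that its `--convert-to fodt` option selects the Flat XML text filter; the class name `FlatOdtExport` and its methods are hypothetical.

```java
import java.io.IOException;
import java.nio.file.Path;
import java.util.List;

public class FlatOdtExport {

    /** Builds the command line. Assumption: "soffice" is on the PATH. */
    static List<String> buildCommand(Path odt, Path outDir) {
        return List.of(
                "soffice", "--headless",
                "--convert-to", "fodt",          // target extension: Flat ODT
                "--outdir", outDir.toString(),
                odt.toString());
    }

    /** Runs the conversion and returns the path where the .fodt file is expected. */
    static Path convert(Path odt, Path outDir) throws IOException, InterruptedException {
        Process process = new ProcessBuilder(buildCommand(odt, outDir))
                .inheritIO()                      // show soffice messages on the console
                .start();
        int exitCode = process.waitFor();
        if (exitCode != 0) {
            throw new IOException("soffice exited with code " + exitCode);
        }
        String name = odt.getFileName().toString().replaceFirst("\\.odt$", ".fodt");
        return outDir.resolve(name);
    }
}
```

Calling `FlatOdtExport.convert(Path.of("report.odt"), Path.of("out"))` should leave `out/report.fodt` behind if the assumptions hold; `report.odt` is just a placeholder name.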

gioele
  • You can then open this file with a text editor, or just use cat, and you will see the formatted XML source. – Jason S Jan 10 '14 at 05:54
  • Are you sure there's a way to do that with LibreOffice? I don't see "Flat ODT" anywhere. – Stéphane Laurent Apr 17 '14 at 08:34
  • @StéphaneLaurent: yes LibreOffice supports Flat ODT files. For example the [release notes for 3.4](https://wiki.documentfoundation.org/ReleaseNotes/3.4) say «Re-write flat ODF import and export file filters from Java to C++ giving a huge speed increase.» – gioele Apr 18 '14 at 07:37
  • Thank you @gioele, but do you know how to save a file in this format? – Stéphane Laurent Apr 18 '14 at 08:39
  • Just like any other file format: Save -> select "OpenDocument Text (Flat XML) (.fodt)" as its format in the bottom-right corner. – gioele Apr 18 '14 at 09:55

I solved this by writing an XSLT stylesheet that transforms the ODT source files into a single XML file that is "more or less" compatible with the standard. The code is below.

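<?xml version="1.0" encoding="UTF-8"?>
<!-- Apply this stylesheet to content.xml extracted from the ODT archive. -->
<xsl:stylesheet version="1.0"
    xmlns:xsl="http://www.w3.org/1999/XSL/Transform"
    xmlns:office="urn:oasis:names:tc:opendocument:xmlns:office:1.0">

    <!-- Location of meta.xml; a relative path is resolved against this stylesheet's location. -->
    <xsl:param name="meta.file" select="'meta.xml'" />

    <!-- Identity template: copy everything that is not handled below. -->
    <xsl:template match="@*|node()">
        <xsl:copy>
            <xsl:apply-templates select="@*|node()" />
        </xsl:copy>
    </xsl:template>

    <!-- Replace the content.xml root element with office:document and pull in
         the metadata. styles.xml and settings.xml are not merged. -->
    <xsl:template match="office:document-content">
        <office:document>
            <xsl:copy-of select="@*" />
            <xsl:variable name="meta" select="document($meta.file)/office:document-meta/office:meta" />
            <xsl:copy-of select="$meta" />
            <xsl:apply-templates />
        </office:document>
    </xsl:template>

</xsl:stylesheet>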
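Since the question asked for a Java class, here is a minimal sketch of how a stylesheet like the one above could be applied directly to an .odt archive with the JDK's built-in XSLT support, without unpacking it first. The class name `OdtFlattener` and its methods are hypothetical. The sketch assumes Java 9 or later (for `readAllBytes`) and that the XSLT processor passes the `document()` argument to the `URIResolver` unchanged, so that `meta.xml` can be served from the archive.

```java
import java.io.ByteArrayInputStream;
import java.io.IOException;
import java.io.InputStream;
import java.io.StringWriter;
import java.nio.file.Path;
import java.util.zip.ZipEntry;
import java.util.zip.ZipFile;
import javax.xml.transform.Source;
import javax.xml.transform.Transformer;
import javax.xml.transform.TransformerException;
import javax.xml.transform.TransformerFactory;
import javax.xml.transform.stream.StreamResult;
import javax.xml.transform.stream.StreamSource;

public class OdtFlattener {

    /** Reads one entry of the ODT archive fully into memory. */
    static byte[] readEntry(ZipFile zip, String name) throws IOException {
        ZipEntry entry = zip.getEntry(name);
        if (entry == null) {
            throw new IOException("Missing archive entry: " + name);
        }
        try (InputStream in = zip.getInputStream(entry)) {
            return in.readAllBytes();
        }
    }

    /** Transforms content.xml from the archive and returns the result as a string. */
    static String flatten(Path odt, Source stylesheet)
            throws IOException, TransformerException {
        try (ZipFile zip = new ZipFile(odt.toFile())) {
            Transformer transformer =
                    TransformerFactory.newInstance().newTransformer(stylesheet);

            // Serve document('meta.xml') from the archive instead of the file system.
            transformer.setURIResolver((href, base) -> {
                try {
                    return new StreamSource(new ByteArrayInputStream(readEntry(zip, href)));
                } catch (IOException e) {
                    throw new TransformerException(e);
                }
            });

            StringWriter out = new StringWriter();
            Source content = new StreamSource(
                    new ByteArrayInputStream(readEntry(zip, "content.xml")));
            transformer.transform(content, new StreamResult(out));
            return out.toString();
        }
    }
}
```

With the stylesheet saved as, say, `flatten.xsl` (a placeholder name), a call such as `OdtFlattener.flatten(Path.of("report.odt"), new StreamSource(new File("flatten.xsl")))` would return the merged XML. Resolving `meta.xml` through a `URIResolver` keeps the stylesheet unchanged and avoids extracting the archive to disk.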
WojtusJ