5

I have a requirement to convert OpenOffice Excel data into XML . I have Apache OpenOffice 4.1.1 in my machine (not MS- Office).

Sample Data. (First Row is of Tags )

CustData FirstName  MiddleName  LastName   EMail             PhoneNumber
           abe       x          Park      abe@mail.com       2323232323
           poppy     y          Kaith     Poppy@mail.com     2323232323

Need Result as :

<CustData>
        <FirstName>abe</FirstName>  
        <MiddleName>x</MiddleName>
        <LastName>Park</LastName>   
        <EMail>abe@mail.com</EMail>             
        <PhoneNumber>2323232323</PhoneNumber>
</CustData>
<CustData>
       <FirstName>poppy</FirstName>  
       <MiddleName>y</MiddleName>
       <LastName>Kaith</LastName>   
        <EMail>Poppy@mail.com </EMail>             
        <PhoneNumber>2323232323</PhoneNumber>
</CustData>
tohuwawohu
  • 13,268
  • 4
  • 42
  • 61
Rohit
  • 172
  • 1
  • 9
  • Both the OpenOffice and Microsoft Office formats already _are_ in an XML format (or zipped XML format). You should perhaps take as input the actual XML document saved by Calc, and use XSLT to transform it. – Mathias Müller Sep 28 '15 at 06:50
  • @MathiasMüller From where we get actual XML document saved by Calc , XSLT to transform and how we use it ? If a Example is given with openoffice its really a great help to me . – Rohit Sep 28 '15 at 06:57
  • Calc saves your document as a _file_ in your file system. If you open such a file in a plain text editor, you will immediately see that it is an XML document. Look at this plain file and find out where the interesting data is located. Target those elements with XSLT templates. Please note: I can only point you in the right direction, your question as such is too vague and broad for Stackoverflow. – Mathias Müller Sep 28 '15 at 14:53

3 Answers3

6

Openoffice and Libreoffice Calc is able transforming its XML via XSLT with Export Filters. To do so with your example data, do the following:

At first create the following XSL file and save it as SampleDataExportFilter.xsl:

<?xml version="1.0" encoding="UTF-8"?>
<xsl:stylesheet version="1.0" xmlns:xsl="http://www.w3.org/1999/XSL/Transform" xmlns:office="urn:oasis:names:tc:opendocument:xmlns:office:1.0" xmlns:table="urn:oasis:names:tc:opendocument:xmlns:table:1.0" xmlns:text="urn:oasis:names:tc:opendocument:xmlns:text:1.0" exclude-result-prefixes="office table text">

 <xsl:template match="/">
  <root>
   <xsl:apply-templates select="/*/office:body" />
  </root>
 </xsl:template>

 <xsl:template match="office:body">
  <xsl:apply-templates />
 </xsl:template>

 <xsl:template match="office:spreadsheet">
  <xsl:apply-templates />
 </xsl:template>

 <xsl:template match="office:spreadsheet/table:table">

   <xsl:for-each select="table:table-row[position() &gt; 1]">

   <CustData>
    <FirstName><xsl:value-of select="table:table-cell[2]/text:p" /></FirstName> 
    <MiddleName><xsl:value-of select="table:table-cell[3]/text:p" /></MiddleName>
    <LastName><xsl:value-of select="table:table-cell[4]/text:p" /></LastName>   
    <EMail><xsl:value-of select="table:table-cell[5]/text:p" /></EMail>            
    <PhoneNumber><xsl:value-of select="table:table-cell[6]/text:p" /></PhoneNumber>
   </CustData>

   </xsl:for-each>

 </xsl:template>
</xsl:stylesheet>

Now open Calc and select Tools - XML Filter Settings:

enter image description here

Select New and fill the dialog General:

enter image description here

In the register Transformation select the SampleDataExportFilter.xsl as XSLT for export:

enter image description here

Confirm with OK and the XML Filter Settings with Close.

Now create the following Calc file:

enter image description here

With File - Export you shold now be able to export the spreadsheet data using the File type CustData (.xml) as XML.

Axel Richter
  • 56,077
  • 6
  • 60
  • 87
  • Using OpenOffice 3.4 and when validating I see a perfectly fine XML file in front of me. But when I press `Validate` it gives me two errors: `Document is invalid: no grammar found.` and `Document root element "roor", must match DOCTYPE root "null".` If I try to use this to export it fails. Any idea what can be wrong? – Madmenyo Oct 09 '15 at 22:52
  • OpenOffice 3.4? Really? Current versions are 4.x. The errors mentioned from the oO XML validator are not relevant. A `Grammar` is a Document Type Declaration (`DTD`). The exported XML of course has none. Why it should have? The next error is a successor of this. Try the same with the oO own XML-Filter, `MS Excel 2003 XML` for example. Same "errors". Those errors can not lead the export to fail. – Axel Richter Oct 10 '15 at 07:56
  • Yeah really, I hate it when they change or remove X when I got used to that. Or when things stop working, happend countless of times to me with updating software. Anyway, I worked it out using this: https://forum.openoffice.org/en/forum/viewtopic.php?t=3490 it's somewhat different then your example but I got it to work the same way I tried it with your example. – Madmenyo Oct 10 '15 at 08:11
  • My example is extremely simplified and will only work with exactly the Calc table that is shown in the example: FirstName in column B, MiddleName in column C,.... It is provided to understand the principle not to be a ready to use solution. – Axel Richter Oct 10 '15 at 08:24
  • I just changed the first row naming and corresponding parts in the XSL file. And I think I should not even have to change the XSL since it is looking for a cell not values and the xml `` can be anything. It just is strange that it did show a XML file on validating but it would not save it. – Madmenyo Oct 10 '15 at 08:32
  • Works for me with OOCalc 4.11. How exactly the export is failing? Error message/s? Save XML with only empty ? Simply nothing saved? – Axel Richter Oct 10 '15 at 08:54
  • A simple error message when trying to export "could not be saved" or something. When validating the export XSL it shows those two errors in my first comment, but that is also the case for my current working solution. This is working: https://www.dropbox.com/s/zorkxxst6lkb47z/import.xsl?dl=0 Whatever I do to the sheet, this always works. It just gets B1:B6 for each row that has data. – Madmenyo Oct 10 '15 at 09:07
  • Sorry, can't reproduce. My XSL works also for me with OpenOffice Calc 3.2.1 which I had nor found in my Ubuntu. – Axel Richter Oct 10 '15 at 10:02
  • Thanks for looking into it. I figured there might be something wrong with your answer but I guess I overlooked something. I have a working solution so it does not really matter to me. – Madmenyo Oct 10 '15 at 11:44
1

You can download OpenXmlSDK Open XML SDK It contains Productivity Tools which could help you to discover structure of excel documents, for example: enter image description here

And then use any available XSLT tutorial to find out about approaches of conversion one XML structure to another

Tutorial 1

Java tutorial

And also this might help you:

https://github.com/foglcz/xsl-excel-engine

Rao
  • 20,781
  • 11
  • 57
  • 77
Roman Badiornyi
  • 1,509
  • 14
  • 28
0

Awesome tips! Found a bug in Calc 6.0.7.3, if two consecutive cells are the same value it skips the value of the second cell and uses value of the next cell! All subsequent values for that row are shifted by a column as well, and it's accumulative within the row. If I reorganize the columns so there are no same field values in side-by-side columns, and change the cell ref # in the .xsl to match, it works fine.