0

I have an xml file which is updated every few hours by an e-commerce platform. I would like to generate a separate xml file with xpath filter.

XPath code is one line.

Which language should I use to generate that xml? Where can I find any template to make it work?

Ok, so i got that xml file:

 <o id="17" url="url" price="15.00" avail="1" weight="0" stock="3" set="0" 
 basket="0">
 <cat><![CDATA[ category ]]></cat>
 <name><![CDATA[ name ]]></name>
 <imgs><main url="url"/></imgs>
 <desc><![CDATA[description]]></desc>
 <attrs><a name="text"><![CDATA[ Dev ]]></a>
 <a name="Code"><![CDATA[ ]]></a>
 <a name="EAN"><![CDATA[ EAN ]]></a>
 </attrs>

 <o id="18" url="url" price="15.00" avail="1" weight="0" stock="3" set="0" 
 basket="0">
 <cat><![CDATA[ category2 ]]></cat>
 <name><![CDATA[ name ]]></name>
 <imgs><main url="url"/></imgs>
 <desc><![CDATA[description]]></desc>
 <attrs><a name="text"><![CDATA[ Dev ]]></a>
 <a name="Code"><![CDATA[ ]]></a>
 <a name="EAN"><![CDATA[ EAN ]]></a>
 </attrs>

There is many categories of products, but i need in seperate xml only products from "category2" with whole structure of the product. I found this to make this manually:

//o[normalize-space(cat) = 'category2']

Platform updating file every few hours with new products. So i would like to have script that download this file, automatically generate new xml with only category2.

Fenixq
  • 3
  • 2
  • Your question is very imprecise so it will not encourage many answers. If I get your right, you are looking for a tool to extract portions from an xml file and transform it into some other xml. The common tool for this job is an xslt stylesheet run through an xslt processor. The libxml library is another widespread interface to read and write xml data and has bindings for virtually every programming environment I am aware of. Without more info about the setting within which you operate, there is no sensible way to advise you on the language/toolset. Btw: XPath is not a programming language. – collapsar Sep 25 '19 at 16:30
  • i edited the post with more info – Fenixq Sep 25 '19 at 18:19

1 Answers1

2

Simply use the special-purpose language, XSLT, designed to transform XML files to run needed XPath expression. Specifically call the identity transform to copy entire document as is and remove all other nodes that do not fit criteria with an empty template.

Nearly every modern programming language (Java, C#, Python, PHP, Perl, R, VB, even Bash and PowerShell) and some software (Excel, SAS, etc.) maintains support for XSLT 1.0 usually via an XML library or module. Additionally, dedicated processors can run your XSLT script including Saxon, Xalan, xsltproc. See tag page for more info.

XSLT (save as .xsl file, a special .xml file)

<xsl:stylesheet version="1.0" xmlns:xsl="http://www.w3.org/1999/XSL/Transform">
  <xsl:output indent="yes"/>
  <xsl:strip-space elements="*"/>

  <!-- IDENTITY TRANSFORM -->
  <xsl:template match="@*|node()">
    <xsl:copy>
      <xsl:apply-templates select="@*|node()"/>
    </xsl:copy>
  </xsl:template>

  <!-- REMOVE NODES -->
  <xsl:template match="o[normalize-space(cat) != 'category2']"/>

</xsl:stylesheet>

Online Demo


A few XSLT processor examples:

Xalan command line (with Java compiler)

java org.apache.xalan.xslt.Process -IN source.xml -XSL script.xsl -OUT output.xml

Saxon command lines (with Java compiler)

java net.sf.saxon.Transform -s:source.xml -xsl:script.xsl -o:output.xml

java -jar dir/saxon9he.jar -s:source.xml -xsl:script.xsl -o:output.xml

xsltproc command line (for Unix machines)

xsltproc myScript.xsl Output.xml > myDesiredResult.xml

Powershell script (for Windows machines)

$xslt = New-Object System.Xml.Xsl.XslCompiledTransform;

$xslt.Load("C:\Path\To\Script.xsl");
$xslt.Transform("C:\Path\To\Input.xml", "C:\Path\To\Output.xml");
Parfait
  • 104,375
  • 17
  • 94
  • 125
  • This is it! Thanks a lot! Is there option to send file to ftp server with PowerShell? I got Windows machine working 24/7. – Fenixq Sep 27 '19 at 08:46
  • Glad to help! That sounds like a new question but definitely search online for such a solution. – Parfait Sep 27 '19 at 13:49