Consider transforming your XML with XSLT, a declarative and special-purpose language like SQL, used specifically to transform XML documents. And since the mysql
CLI can run shell commands using system
or \!
, you can call an installed XSLT processor at command line or run a prepared (and compiled) general-purpose language (Java, Python, PHP, etc.) script at command line.
Assuming you need to transform original input to the following where distinct categories are split into different <Book>
nodes.
<?xml version="1.0" encoding="UTF-8"?>
<Library>
<Book>
<Author>Dave</Author>
<Title>XML Help</Title>
<Category>Computers</Category>
</Book>
<Book>
<Author>Dave</Author>
<Title>XML Help</Title>
<Category>XML</Category>
</Book>
</Library>
Run below XSLT. See Online Demo.
<xsl:stylesheet version="1.0" xmlns:xsl="http://www.w3.org/1999/XSL/Transform">
<xsl:output method="xml" omit-xml-declaration="no" indent="yes"/>
<xsl:strip-space elements="*"/>
<xsl:template match="node()|@*">
<xsl:copy>
<xsl:apply-templates select="node()|@*"/>
</xsl:copy>
</xsl:template>
<xsl:template match="Book">
<xsl:apply-templates select="Category"/>
</xsl:template>
<xsl:template match="Category">
<Book>
<xsl:apply-templates select="preceding-sibling::Author"/>
<xsl:apply-templates select="preceding-sibling::Title"/>
<xsl:copy>
<xsl:apply-templates select="node()"/>
</xsl:copy>
</Book>
</xsl:template>
<xsl:template match="text()">
<xsl:apply-templates select="normalize-space()"/>
</xsl:template>
</xsl:stylesheet>
Then, call shell commands from the mysql
CLI:
With Unix's xsltproc
mysql> system xsltproc myScript.xsl Input.xml > Output.xml
With Windows' System.Xml.Xsl (see Powershell script here)
mysql> system Powershell.exe -File "PS_Script.ps1" "Input.xml" "myScript.xsl" "Output.xml"
Call general-purpose languages (see XSLT scripts here):
mysql> system java run_xslt
mysql> system python run_xslt.py
mysql> system php run_xslt.php
mysql> system perl run_xslt.pl
mysql> system Rscript run_xslt.R
Finally, run LOAD XML
using transformed document:
LOAD XML LOCAL INFILE 'myTranformedOutput.xml'
INTO TABLE mytable
ROWS IDENTIFIED BY '<Book>';