
I am new to Python and want to modify an XML file. Below is a sample of the input, followed by the output I would like to get.

Original:

<programme channel="I9.11363.zap2it.com" start="20220729080000 -0500" stop="20220729090000 -0500">
    <title lang="en">Live with Kelly and Ryan</title>
    <sub-title lang="en">Live's Ready or Not Week; Live's Foodfluencer Friday Faceoff</sub-title>
    <desc lang="en">Making an emergency evacuation kit; a chef provides a summertime recipe.</desc>
    <date>20220729</date>
    <category lang="en">Talk</category>
    <category lang="en">Series</category>
    <length units="minutes">60</length>
    <icon src="https://zap2it.tmsimg.com/assets/p14101643_b_v13_ah.jpg" />
    <url>https://tvlistings.zap2it.com//overview.html?programSeriesId=SH02684484&amp;tmsId=EP026844841372</url>
    <episode-num system="common">S06E232</episode-num>
    <episode-num system="dd_progid">EP02684484.1372</episode-num>
    <episode-num system="xmltv_ns">5.231.</episode-num>
    <audio>
        <stereo>stereo</stereo>
    </audio>
    <new />
    <subtitles type="teletext" />
    <rating>
        <value>TV-PG</value>
    </rating>
</programme>

Desired output: the <new /> tag is folded into the title as "New", and <episode-num system="common">S06E232</episode-num> is removed and its value placed in the description.

<programme channel="I9.11363.zap2it.com" start="20220729080000 -0500" stop="20220729090000 -0500">
    <title lang="en">Live with Kelly and Ryan New</title>
    <sub-title lang="en">Live's Ready or Not Week; Live's Foodfluencer Friday Faceoff</sub-title>
    <desc lang="en">S06E232 (return)Making an emergency evacuation kit; a chef provides a summertime recipe. TV-PG 20220729 </desc>
    <icon src="https://zap2it.tmsimg.com/assets/p14101643_b_v13_ah.jpg" />
    <url>https://tvlistings.zap2it.com//overview.html?programSeriesId=SH02684484&amp;tmsId=EP026844841372</url>
</programme>
drew4

1 Answer


Here is an XSLT-based solution, run from Python with lxml.

Input XML

<?xml version="1.0"?>
<programme channel="I9.11363.zap2it.com" start="20220729080000 -0500" stop="20220729090000 -0500">
    <title lang="en">Live with Kelly and Ryan</title>
    <sub-title lang="en">Live's Ready or Not Week; Live's Foodfluencer Friday Faceoff</sub-title>
    <desc lang="en">Making an emergency evacuation kit; a chef provides a summertime recipe.</desc>
    <date>20220729</date>
    <category lang="en">Talk</category>
    <category lang="en">Series</category>
    <length units="minutes">60</length>
    <icon src="https://zap2it.tmsimg.com/assets/p14101643_b_v13_ah.jpg"/>
    <url>https://tvlistings.zap2it.com//overview.html?programSeriesId=SH02684484&amp;tmsId=EP026844841372</url>
    <episode-num system="common">S06E232</episode-num>
    <episode-num system="dd_progid">EP02684484.1372</episode-num>
    <episode-num system="xmltv_ns">5.231.</episode-num>
    <audio>
        <stereo>stereo</stereo>
    </audio>
    <new/>
    <subtitles type="teletext"/>
    <rating>
        <value>TV-PG</value>
    </rating>
</programme>

XSLT

<?xml version="1.0"?>
<xsl:stylesheet version="1.0" xmlns:xsl="http://www.w3.org/1999/XSL/Transform">
    <xsl:output method="xml" encoding="utf-8" indent="yes" omit-xml-declaration="yes"/>
    <xsl:strip-space elements="*"/>

    <xsl:template match="@*|node()">
        <xsl:copy>
            <xsl:apply-templates select="@*|node()"/>
        </xsl:copy>
    </xsl:template>

    <!-- Append " new" to the title only when the programme has a <new/> flag -->
    <xsl:template match="title">
        <xsl:copy>
            <xsl:apply-templates select="@*"/>
            <xsl:value-of select="."/>
            <xsl:if test="../new"><xsl:text> new</xsl:text></xsl:if>
        </xsl:copy>
    </xsl:template>

    <!-- Prefix the description with the sibling "common" episode number -->
    <xsl:template match="desc">
        <xsl:copy>
            <xsl:apply-templates select="@*"/>
            <xsl:value-of select="concat(../episode-num[@system='common'], ' ', .)"/>
        </xsl:copy>
    </xsl:template>

    <xsl:template match="date | category | length | episode-num | audio | new | subtitles | rating"/>
</xsl:stylesheet>

Output XML

<programme stop="20220729090000 -0500" channel="I9.11363.zap2it.com" start="20220729080000 -0500">
  <title lang="en">Live with Kelly and Ryan new</title>
  <sub-title lang="en">Live's Ready or Not Week; Live's Foodfluencer Friday Faceoff</sub-title>
  <desc lang="en">S06E232 Making an emergency evacuation kit; a chef provides a summertime recipe.</desc>
  <icon src="https://zap2it.tmsimg.com/assets/p14101643_b_v13_ah.jpg"/>
  <url>https://tvlistings.zap2it.com//overview.html?programSeriesId=SH02684484&amp;tmsId=EP026844841372</url>
</programme>

Python

import lxml.etree as ET

inputfile = "D:\\temp\\input.xml"
xsltfile = "D:\\temp\\process.xslt"
outfile = "D:\\output\\output.xml"

# Parse the source document and the stylesheet
dom = ET.parse(inputfile)
xslt = ET.parse(xsltfile)

# Compile the stylesheet and apply it; it declares no parameters, so none are passed
transform = ET.XSLT(xslt)
newdom = transform(dom)

# Serialize the result (bytes) and write it out, replacing any existing file
with open(outfile, "wb") as f:
    f.write(ET.tostring(newdom, pretty_print=True))
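
If installing lxml is not an option, the same edit can be sketched with the standard library's xml.etree.ElementTree. This is only a rough alternative, not a replacement for the stylesheet: the names rewrite_programme and DROP are made up for illustration, the title suffix is written as " New" to match the question's desired output, and the rating and date are appended to the description because the desired output shows them there. The "(return)" marker in the desired output is not reproduced, since its intended meaning is not clear from the question.

```python
import xml.etree.ElementTree as ET

# Child elements to drop from each <programme>, mirroring the empty XSLT template.
# (DROP and rewrite_programme are hypothetical names used only in this sketch.)
DROP = {"date", "category", "length", "episode-num", "audio",
        "new", "subtitles", "rating"}


def rewrite_programme(prog):
    """Fold the new flag, episode number, rating and date into title/desc, in place."""
    title = prog.find("title")
    desc = prog.find("desc")

    # Append " New" only when the programme carries a <new/> flag.
    if title is not None and prog.find("new") is not None:
        title.text = (title.text or "") + " New"

    if desc is not None:
        episode = prog.findtext("episode-num[@system='common']")
        rating = prog.findtext("rating/value")
        date = prog.findtext("date")
        # Join only the pieces that are present, separated by single spaces.
        parts = [episode, desc.text, rating, date]
        desc.text = " ".join(p for p in parts if p)

    # Remove the elements that were folded in or are no longer wanted.
    for child in list(prog):
        if child.tag in DROP:
            prog.remove(child)
    return prog


# Shortened sample based on the question's input.
sample = """<programme channel="I9.11363.zap2it.com">
    <title lang="en">Live with Kelly and Ryan</title>
    <desc lang="en">Making an emergency evacuation kit.</desc>
    <date>20220729</date>
    <episode-num system="common">S06E232</episode-num>
    <episode-num system="dd_progid">EP02684484.1372</episode-num>
    <new />
    <rating>
        <value>TV-PG</value>
    </rating>
</programme>"""

prog = rewrite_programme(ET.fromstring(sample))
print(ET.tostring(prog, encoding="unicode"))
```

For a full listings file with many programmes under one root element, the same function could be applied to each element returned by root.iter("programme") before writing the tree back out with ElementTree.write.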
Yitzhak Khabinsky