I'm trying to:

- read a KML file
- remove the Placemark element if name = 'ZONE'
- write a new KML file without the element

This is my code:

from pykml import parser
kml_file_path = '../Source/Lombardia.kml'

removeList = list()

with open(kml_file_path) as f:
    folder = parser.parse(f).getroot().Document.Folder

for pm in folder.Placemark:
    if pm.name == 'ZONE':
        removeList.append(pm)
        print(pm.name)

for tag in removeList:
    parent = tag.getparent()
    parent.remove(tag)

# Write the new file
# (this is the step I cannot work out)

and this is the KML:

<?xml version="1.0" encoding="UTF-8"?>
<kml xmlns="http://earth.google.com/kml/2.2">
<Document>
    <name>Lombardia</name>
    <Style>
    ...
    </Style>
    <Folder>
<Placemark>
            <name>ZOGNO</name>
            <styleUrl>#FEATURES_LABELS</styleUrl>
            <Point>
                <coordinates>9.680530595139061,45.7941656233647,0</coordinates>
            </Point>
        </Placemark>
        <Placemark>
            <name>ZONE</name>
            <styleUrl>#FEATURES_LABELS</styleUrl>
            <Point>
                <coordinates>10.1315885854064,45.7592449779275,0</coordinates>
            </Point>
        </Placemark>
    </Folder>
</Document>
</kml>

The problem is that when I write the new KML file, it still contains the element I want to delete, namely the Placemark whose name is ZONE. What am I doing wrong? Thank you.

Final code

This is the working code, thanks to @Dawid Ferenczy:

from lxml import etree
from pykml import parser

kml_file_path = '../Source/Lombardia.kml'

# parse the input file into an object tree
with open(kml_file_path) as f:
    tree = parser.parse(f)

# get a reference to the "Document.Folder" node
folder = tree.getroot().Document.Folder

# iterate through all "Document.Folder.Placemark" nodes and find and remove all nodes
# which contain child node "name" with content "ZONE"
for pm in folder.Placemark:
    if pm.name == 'ZONE':
        parent = pm.getparent()
        parent.remove(pm)

# convert the object tree into a string and write it into an output file
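# (etree.tostring() returns bytes, so the file is opened in binary mode;
# the whole tree is written so that the output is a complete KML document)
with open('output.kml', 'wb') as output:
    output.write(etree.tostring(tree, pretty_print=True))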
CodeMonkey
xCloudx8
  • That code doesn't make sense at all. First, you parse the KML file into a `tree` variable using the XML library. Then you parse it again using the `pykml` library and perform some operations on the result. Finally, you just write the original, untouched `tree` back into a file. I really have no clue what your intentions are here. – David Ferenczy Rogožan Aug 10 '18 at 15:30
  • And you import `etree` from the `lxml` library but never use it. Why are you parsing the file twice using two different libraries? – David Ferenczy Rogožan Aug 10 '18 at 15:33
  • @DawidFerenczy Sorry, I'm kind of tired today. I posted the code without the test rows. But now I cannot figure out how to write the file again without the deleted element. – xCloudx8 Aug 10 '18 at 15:40
  • Does it work for you? I have tested it and it's working as expected. – David Ferenczy Rogožan Aug 10 '18 at 16:35
  • @DawidFerenczy Hi, I tested it just now, but it does not delete the "ZONE" element. I'm trying to figure it out. Thanks a lot. – xCloudx8 Aug 13 '18 at 07:41
  • I added the working code; I was printing the wrong variable =) – xCloudx8 Aug 13 '18 at 08:50

2 Answers

Consider XSLT, the special-purpose language designed to transform XML files. Because KML files are XML files, this solution is viable. Python's third-party module lxml can run XSLT 1.0 scripts, and it does so without a single explicit loop in your code.

Specifically, the XSLT script uses the identity transform to copy the entire document as is. An empty template that matches the unwanted element (here, a Placemark whose name is ZONE) then removes that element from the output. To accommodate the document's default namespace, the prefix doc is bound to the KML namespace and used in the XPath match pattern.

XSLT (save as an .xsl file, which is itself an XML file, to be loaded in the Python code below)

<?xml version="1.0" encoding="UTF-8"?>
<xsl:stylesheet version="1.0" xmlns:xsl="http://www.w3.org/1999/XSL/Transform"
                              xmlns:doc="http://earth.google.com/kml/2.2">
    <xsl:output method="xml" indent="yes"/>
    <xsl:strip-space elements="*"/>

    <xsl:template match="@* | node()">
        <xsl:copy>
            <xsl:apply-templates select="@* | node()"/>
        </xsl:copy>
    </xsl:template>

    <xsl:template match="doc:Placemark[doc:name='ZONE']"/>

</xsl:stylesheet>

Python

import lxml.etree as et

# LOAD XML AND XSL
doc = et.parse('/path/to/Input.xml')
xsl = et.parse('/path/to/XSLT_Script.xsl')

# CONFIGURE TRANSFORMER
transform = et.XSLT(xsl)    

# RUN TRANSFORMATION
result = transform(doc)

# PRINT RESULT
print(result)  

# SAVE TO FILE
with open('output.xml', 'wb') as f:
    f.write(result)

Output

<?xml version="1.0" encoding="UTF-8"?>
<kml xmlns="http://earth.google.com/kml/2.2">
   <Document>
      <name>Lombardia</name>
      <Style>
    ...
    </Style>
      <Folder>
         <Placemark>
            <name>ZOGNO</name>
            <styleUrl>#FEATURES_LABELS</styleUrl>
            <Point>
               <coordinates>9.680530595139061,45.7941656233647,0</coordinates>
            </Point>
         </Placemark>
      </Folder>
   </Document>
</kml>
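
The need for the doc prefix in the stylesheet above can be checked from Python. The following is only a minimal sketch: it uses the standard library's xml.etree.ElementTree instead of lxml, the KML is a shortened copy of the question's file, and the variable names are illustrative. It shows that an unprefixed path matches nothing in a document with a default namespace, while a path using a prefix bound to the same namespace URI does.

```python
import xml.etree.ElementTree as ET

# Shortened copy of the question's KML (Style, styleUrl and Point are left out).
KML_BYTES = b"""<?xml version="1.0" encoding="UTF-8"?>
<kml xmlns="http://earth.google.com/kml/2.2">
<Document>
    <name>Lombardia</name>
    <Folder>
        <Placemark>
            <name>ZOGNO</name>
        </Placemark>
        <Placemark>
            <name>ZONE</name>
        </Placemark>
    </Folder>
</Document>
</kml>
"""

root = ET.fromstring(KML_BYTES)

# The prefix name is arbitrary; only the namespace URI has to match the document.
ns = {"doc": "http://earth.google.com/kml/2.2"}

# Without a prefix, the path looks for elements in no namespace and finds none.
unprefixed = root.findall(".//Placemark")

# With the prefix, both Placemark elements are found.
prefixed = root.findall(".//doc:Placemark", ns)

# Same condition as the empty template: Placemark whose name is 'ZONE'.
matches = [pm for pm in prefixed
           if pm.findtext("doc:name", namespaces=ns) == "ZONE"]

print(len(unprefixed), len(prefixed), len(matches))  # prints: 0 2 1
```

The same rule applies to the match pattern in the stylesheet: without the doc: prefix, the empty template would not match any Placemark and the output would be an unchanged copy.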
Parfait
You have the following issues in your code:

  • You're not storing the whole parsed object tree anywhere. You keep only a reference to the "Document.Folder" node (folder = parser.parse(f).getroot().Document.Folder), but to write the document back into a file you need the whole tree.
  • I don't understand why you need two loops and the list removeList when you can delete the elements directly in the first loop.
  • You're not reading the documentation: how to write the object tree into a file is well described in the examples in the pykml library's documentation.

Try the following code:

from lxml import etree
from pykml import parser

kml_file_path = './input.kml'

# parse the input file into an object tree
with open(kml_file_path) as f:
    tree = parser.parse(f)

# get a reference to the "Document.Folder" node
folder = tree.getroot().Document.Folder

# iterate through all "Document.Folder.Placemark" nodes and find and remove all nodes 
# which contain child node "name" with content "ZONE"
for pm in folder.Placemark:
    if pm.name == 'ZONE':
        parent = pm.getparent()
        parent.remove(pm)

# convert the object tree into a string and write it into an output file
# (etree.tostring() returns bytes, so the file is opened in binary mode)
with open('output.kml', 'wb') as output:
    output.write(etree.tostring(tree, pretty_print=True))

It's very simple:

  • the KML file is parsed into an object tree and stored in the variable tree
  • the same object tree is manipulated directly (the element is removed)
  • the same object tree is written back into a file
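
The same three steps can also be sketched with only the standard library, which may be handy when pykml or lxml is not installed. This is a swapped-in technique, not the pykml approach above: it uses xml.etree.ElementTree, a shortened copy of the question's KML, and a hypothetical helper named remove_placemarks. It also assumes, as in the question, that the Placemark elements are direct children of a Folder.

```python
import xml.etree.ElementTree as ET

KML_NS = "http://earth.google.com/kml/2.2"

# Shortened copy of the question's KML (Style, styleUrl and Point are left out).
KML_BYTES = b"""<?xml version="1.0" encoding="UTF-8"?>
<kml xmlns="http://earth.google.com/kml/2.2">
<Document>
    <name>Lombardia</name>
    <Folder>
        <Placemark>
            <name>ZOGNO</name>
        </Placemark>
        <Placemark>
            <name>ZONE</name>
        </Placemark>
    </Folder>
</Document>
</kml>
"""


def remove_placemarks(root, name):
    """Remove each Placemark whose <name> text equals `name`; return the count."""
    removed = 0
    # ElementTree elements have no getparent(), so each Folder is visited
    # and matching children are removed from it directly.
    for folder in list(root.iter("{%s}Folder" % KML_NS)):
        # findall() returns a list, so removing while looping is safe.
        for pm in folder.findall("{%s}Placemark" % KML_NS):
            if pm.findtext("{%s}name" % KML_NS) == name:
                folder.remove(pm)
                removed += 1
    return removed


# Serialize the KML namespace as the default namespace (no ns0: prefixes).
ET.register_namespace("", KML_NS)

# Step 1: parse the document and keep the whole tree, not just the Folder.
root = ET.fromstring(KML_BYTES)

# Step 2: manipulate that same tree.
removed = remove_placemarks(root, "ZONE")

# Step 3: serialize that same tree.
output = ET.tostring(root, encoding="unicode")
print(removed)
print(output)
```

To work with files instead of in-memory data, ET.parse(path) returns a tree object whose write(path, encoding="UTF-8", xml_declaration=True) method saves the result.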
David Ferenczy Rogožan