
I am working on a project in which I have annotated images of leaves and saved the annotations in XML format, in order to identify pests on the leaves using object detection. Some of the pests look similar even though they are actually different, which makes some objects ambiguous, so I decided to remove one class. Since all the images are already annotated, removing the labels by hand would be tedious, so I would like to write a script that removes those objects from the XML files. The structure of a file is:

<annotation>
<folder>Set 3 A</folder>
<filename>IMG-20200904-WA0105.jpg</filename>
<path>C:\Users\Admin\Desktop\Set 3 A\Set 3 A\IMG-20200904-WA0105.jpg</path>
<source>
    <database>Unknown</database>
</source>
<size>
    <width>960</width>
    <height>1280</height>
    <depth>3</depth>
</size>
<segmented>0</segmented>
<object>
    <name>Whiteflies</name>
    <pose>Unspecified</pose>
    <truncated>0</truncated>
    <difficult>0</difficult>
    <bndbox>
        <xmin>232</xmin>
        <ymin>83</ymin>
        <xmax>286</xmax>
        <ymax>173</ymax>
    </bndbox>
</object>
<object>
    <name>Jassid Attack Effect</name>
    <pose>Unspecified</pose>
    <truncated>0</truncated>
    <difficult>0</difficult>
    <bndbox>
        <xmin>356</xmin>
        <ymin>7</ymin>
        <xmax>563</xmax>
        <ymax>359</ymax>
    </bndbox>
</object>
<object>
    <name>Jassid Attack Effect</name>
    <pose>Unspecified</pose>
    <truncated>0</truncated>
    <difficult>0</difficult>
    <bndbox>
        <xmin>356</xmin>
        <ymin>7</ymin>
        <xmax>563</xmax>
        <ymax>359</ymax>
    </bndbox>
</object>
<object>
    <name>Whiteflies</name>
    <pose>Unspecified</pose>
    <truncated>0</truncated>
    <difficult>0</difficult>
    <bndbox>
        <xmin>232</xmin>
        <ymin>83</ymin>
        <xmax>286</xmax>
        <ymax>173</ymax>
    </bndbox>
</object>

I want to remove every object named "Jassid Attack Effect", together with its contents. The name may appear multiple times in a document, as in the XML above, and all occurrences have to be removed. How can I do that? For example, if the object name found while parsing is "Jassid Attack Effect", I want to remove this entire element from the XML file:

<object>
    <name>Jassid Attack Effect</name>
    <pose>Unspecified</pose>
    <truncated>0</truncated>
    <difficult>0</difficult>
    <bndbox>
        <xmin>356</xmin>
        <ymin>7</ymin>
        <xmax>563</xmax>
        <ymax>359</ymax>
    </bndbox>
</object>
TjR

2 Answers

Try something like this:

from lxml import etree

# Use a raw string (r"...") because the <path> value contains unescaped backslashes.
# Note that the XML in the question is not well-formed: the closing </annotation> tag is missing.
stuff = r"""your xml above"""

doc = etree.XML(stuff)
# Select <object> elements whose <name> child equals "Jassid Attack Effect"; [0] takes the first match
target = doc.xpath('//object[name="Jassid Attack Effect"]')[0]
target.getparent().remove(target)
print(etree.tostring(doc).decode())

Output:

<annotation>
<folder>Set 3 A</folder>
<filename>IMG-20200904-WA0105.jpg</filename>
<path>C:\Users\Admin\Desktop\Set 3 A\Set 3 A\IMG-20200904-WA0105.jpg</path>

<source><database>Unknown</database></source>
<size>
  <width>960</width>
    <height>1280</height>
    <depth>3</depth>
</size>
<segmented>0</segmented>
<object>
    <name>Whiteflies</name>
    <pose>Unspecified</pose>
    <truncated>0</truncated>
    <difficult>0</difficult>
    <bndbox>
        <xmin>232</xmin>
        <ymin>83</ymin>
        <xmax>286</xmax>
        <ymax>173</ymax>
    </bndbox>
</object>
</annotation>
Jack Fleeting
  • Thank you! Some files contain "Jassid Attack Effect" multiple times, so how do I remove all of them from the file, not just a single one? And how do I save these changes to the original file? – TjR Mar 08 '21 at 08:44
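
Regarding the follow-up above (removing every match and saving the result), here is a minimal sketch rather than a definitive solution. It swaps lxml for the standard library's xml.etree.ElementTree so that nothing needs to be installed, and it assumes each file is well-formed (including the closing </annotation> tag) and that the <object> elements are direct children of the root, as in the question. The function name remove_class and its parameters are hypothetical, chosen only for illustration.

```python
import xml.etree.ElementTree as ET


def remove_class(xml_path, class_name, out_path=None):
    """Remove every <object> whose <name> equals class_name.

    Hypothetical helper for illustration; returns the number of objects removed.
    """
    tree = ET.parse(xml_path)
    root = tree.getroot()
    # Collect the matches first so the tree is not modified while iterating over it.
    targets = [obj for obj in root.findall("object")
               if obj.findtext("name") == class_name]
    for obj in targets:
        root.remove(obj)
    # Overwrite the original file unless a separate output path is given.
    tree.write(out_path or xml_path)
    return len(targets)
```

Calling remove_class("ann.xml", "Jassid Attack Effect") would rewrite ann.xml in place, so it is probably wise to try it on copies of the annotation files first. To process a whole folder, the function could be called once per path yielded by pathlib.Path(folder).glob("*.xml").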

Install the pascal-voc package:

pip install pascal-voc

Then run:

from pascal import annotation_from_xml
from pascal.utils import save_xml

if __name__ == "__main__":
    ann = annotation_from_xml("ann.xml")
    ann.filter_objects(["Jassid Attack Effect"])
    xml = ann.to_xml()
    save_xml("new_ann.xml", xml)