XML: how can I search a whole XML document for nodes by id and delete them?

Question

A simple example of an XML file:

<?xml version="1.0" encoding="UTF-8"?>

<bookstore>
    <speklap name="gj">
    <book>
      <title lang="en" id="1">Harry Potter</title>
      <price>29.99</price>
    </book>
    <book>
        <title lang="en" id="2">Learning XML</title>
        <price>39.95</price>
      </book>
    <photostore>
        <photo>
             <title lang="en" id="3">Learning XPATH</title>
             <price>1.000</price>
           </photo>
       </photostore>
    </speklap>
 </bookstore>

What I want to achieve is to search for the nodes with the attributes id=2 and id=3 and remove only these two nodes. I can find plenty of examples that target a node by its path, but none that search the whole XML document for a node by its id and remove only that node.

So the desired output is:

<bookstore>
    <speklap name="gj">
    <book>
      <title lang="en" id="1">Harry Potter</title>
      <price>29.99</price>
    </book>
    <book>
        <price>39.95</price>
      </book>
    <photostore>
        <photo>
             <price>1.000</price>
           </photo>
       </photostore>
    </speklap>
 </bookstore>

It would be great to have a simple script, but I'm a beginner. I tried XQuery, but I'm also interested in a bash script. I hope somebody can point me in the right direction.

This might help: [XMLStarlet delete parent node](https://stackoverflow.com/q/66697357/3776858) — Cyrus, Mar 20 '21 at 08:07
Thx Cyrus I managed to fix it with xml starlet but only for one id: Do you know how to add the second id? xmlstarlet ed -d "//*[@id='1']" test.xml — GJF, Mar 20 '21 at 08:34
`xmlstarlet ed -d "//*[@id='1']" -d "//*[@id='2']" test.xml`? — Cyrus, Mar 20 '21 at 08:43
Thnx Cyrus I just found also this working example: xmlstarlet ed -d "//*[@id='1'or @id='2']" test.xml — GJF, Mar 20 '21 at 08:44
With XSLT this is extremely easy. Are you interested in an XSLT solution? — Dimitre Novatchev, Mar 21 '21 at 03:36

score 2 · Answer 1 · answered Mar 20 '21 at 08:46

xmlstarlet ed -d "//*[@id='2' or @id='3']" test.xml

GJF

This would also be possible if you need a range. `xmlstarlet ed -d "//*[@id>'0' and @id<'3']" test.xml` – Cyrus Mar 20 '21 at 08:51
Great answer. If you added a little explanation I would upvote it. – dawg Apr 28 '21 at 12:38

score 1 · Accepted Answer · answered Mar 20 '21 at 13:36

With BaseX, the following command call can be used to delete nodes in a document:

basex -u -i test.xml "delete node //*[@id = (2, 3)]"

With -u, updates will be propagated back to the original file. With -i, the input document is specified. The subsequent string is a valid XQuery expression with the requested update.

One alternative is to directly specify the input document in the query (and I have slightly modified the predicate; it’s equivalent to the first version):

basex -u "delete node doc('test.xml')//*[@id = 2 or @id = 3]"
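If neither tool is installed, the same search-and-delete can be sketched with Python's standard library. This is only an illustration, not part of either answer: the function names `delete_by_id` and `delete_in_file` are made up for this example, and the ids are compared as plain strings. ElementTree elements do not know their parent, so the sketch visits every element and checks its direct children.

```python
import xml.etree.ElementTree as ET


def delete_by_id(root, ids):
    """Remove every descendant of root whose id attribute is in ids.

    The root element itself is never removed, because it has no parent.
    """
    ids = set(ids)
    # Snapshot the traversal first so that removing children
    # does not disturb the iteration.
    for parent in list(root.iter()):
        for child in list(parent):
            if child.get("id") in ids:
                # Removes the element together with its tail whitespace.
                parent.remove(child)


def delete_in_file(path, ids):
    """Apply delete_by_id to the file at path and write the result back."""
    tree = ET.parse(path)
    delete_by_id(tree.getroot(), ids)
    tree.write(path, encoding="UTF-8", xml_declaration=True)
```

Calling `delete_in_file("test.xml", {"2", "3"})` would overwrite `test.xml` in place, similar in spirit to the `-u` flag above, so it is worth trying on a copy first.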
