I have quite a few XML documents from which I want to delete particular child elements.

I've found some regular expressions for similar tasks, but none of them quite worked for my specific case: they deleted more than needed.

Maybe someone could help me out with this? I'm using Notepad++.

The goal is to delete every <Item type="CEntityDef"> element that contains the child <parentIndex value="-1" />.

<?xml version="1.0" encoding="UTF-8"?>
<CMapData>
 <entities>
  <Item type="CEntityDef">
   <archetypeName>something</archetypeName>
   <parentIndex value="255" />
  </Item>
  <Item type="CEntityDef">
   <archetypeName>something</archetypeName>
   <parentIndex value="2334" />
  </Item>
  <Item type="CEntityDef">
   <archetypeName>something_2</archetypeName>
   <parentIndex value="-1" />
  </Item>
  <Item type="CEntityDef">
   <archetypeName>something_2</archetypeName>
   <parentIndex value="-1" />
  </Item>
 </entities>
</CMapData>

Desired outcome

<?xml version="1.0" encoding="UTF-8"?>
<CMapData>
 <entities>
  <Item type="CEntityDef">
   <archetypeName>something</archetypeName>
   <parentIndex value="255" />
  </Item>
  <Item type="CEntityDef">
   <archetypeName>something</archetypeName>
   <parentIndex value="2334" />
  </Item>
 </entities>
</CMapData>

Thank you for reading!

Radek
  • You should use an XML parser with your favorite scripting language. XML and regex are not good friends. – Toto Sep 28 '22 at 11:35
  • You might consider using cscript with MSXML or powershell to parse the xml, select elements with XPath and remove them. – William Walseth Sep 28 '22 at 12:33

1 Answer

This is a very straightforward job for XSLT. In XSLT 3.0 it's:

<xsl:transform xmlns:xsl="http://www.w3.org/1999/XSL/Transform" version="3.0">
 <xsl:mode on-no-match="shallow-copy"/>
 <xsl:template match="Item[@type='CEntityDef']
                          [parentIndex/@value='-1']"/>
</xsl:transform>

Don't try to do this kind of thing with regular expressions.
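
If no XSLT processor is at hand, the same deletion can be sketched with an XML parser from a scripting language, as the comments suggest. The following is a minimal illustration using Python's standard xml.etree.ElementTree module, not part of the original answer. The function names remove_matching_items and clean_file are hypothetical, and it assumes each <parentIndex> is a direct child of its <Item>, as in the sample.

```python
import xml.etree.ElementTree as ET


def remove_matching_items(root):
    """Remove every <Item type="CEntityDef"> that has a direct child
    <parentIndex value="-1"/>. Returns the number of removed elements.

    (Hypothetical helper name; the element and attribute names come
    from the sample document in the question.)
    """
    removed = 0
    # ElementTree elements do not know their parent, so walk every
    # element and inspect its direct children. The list() snapshot
    # avoids mutating the tree while iterating over it.
    for parent in list(root.iter()):
        for item in parent.findall("Item[@type='CEntityDef']"):
            if item.find("parentIndex[@value='-1']") is not None:
                parent.remove(item)
                removed += 1
    return removed


def clean_file(src_path, dst_path):
    """Parse src_path, drop the matching items, write to dst_path.

    (Hypothetical helper; both paths are supplied by the caller.)
    """
    tree = ET.parse(src_path)
    count = remove_matching_items(tree.getroot())
    tree.write(dst_path, encoding="UTF-8", xml_declaration=True)
    return count
```

Calling clean_file on each document in a loop would cover the "quite a few XML docs" case. Note that ElementTree does not preserve the original indentation exactly, so whitespace around the removed elements may shift slightly.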

If it's a one-off requirement you could use an interactive tool like xmlstarlet or Saxon's Gizmo. In Gizmo it's simply

delete //Item[@type='CEntityDef'][parentIndex/@value='-1']
Michael Kay