
I have a large XML file with the structure below. I want to remove the <tuv xml:lang="en-GB"><seg>CONTENT</seg></tuv> nodes so that only the de-DE part of each unit remains (<tuv xml:lang="de-DE"><seg>CONTENT</seg></tuv>). Is there a way to do this with Notepad++ or a different tool? I am not really into coding, so the simpler the better.

What I have:

<tu tuid="ID_0">
<tuv xml:lang="en-GB">
<seg>Hello!</seg>
</tuv>
<tuv xml:lang="de-DE">
<seg>Hallo!</seg>
</tuv>
</tu>
<tu tuid="ID_1">
<tuv xml:lang="en-GB">
<seg>This is a test content! :)</seg>
</tuv>
<tuv xml:lang="de-DE">
<seg>Das ist ein Testinhalt! :)</seg>
</tuv>
</tu>
<tu tuid="ID_2">
<tuv xml:lang="en-GB">
<seg>All your base are belong tu us ...</seg>
</tuv>
<tuv xml:lang="de-DE">
<seg>Och nö, echt jetzt?</seg>
</tuv>
</tu>

What I want:

<tu tuid="ID_0">
<tuv xml:lang="de-DE">
<seg>Hallo!</seg>
</tuv>
</tu>
<tu tuid="ID_1">
<tuv xml:lang="de-DE">
<seg>Das ist ein Testinhalt! :)</seg>
</tuv>
</tu>
<tu tuid="ID_2">
<tuv xml:lang="de-DE">
<seg>Och nö, echt jetzt?</seg>
</tuv>
</tu>
Brian Tompsett - 汤莱恩
  • This has nothing to do with programming, AFAIK... would be best to move this question to `superuser.com` to get your answers. – code4life Aug 22 '12 at 14:42
  • I received a solution, if anyone is ever looking for this as well: Ctrl+H (Replace...) Find what: .*? Search mode: Regular expression checked: . matches newline – Robert Herzog Aug 23 '12 at 08:20

1 Answer


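This can be done with Notepad++'s regex find and replace. The pattern below assumes Windows (CRLF) line endings and that each en-GB block spans exactly three lines, as in the sample.
Press Ctrl+H to open the Replace dialog.

  • Find what: <tuv xml:lang="en-GB">\r\n.*\r\n.*\r\n
  • Replace with: (leave blank)
  • Search mode: Regular expression, with ". matches newline" unchecked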
  • Click Replace All
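
If the file does not follow the exact three-line layout the pattern above relies on, an XML parser is less sensitive to line breaks. The following is a minimal Python sketch, not part of the original answer. It assumes the <tu> elements sit inside a single root element (the <body> wrapper in the sample string is an assumption, since the question shows only a fragment), and the function name drop_language is made up for illustration.

```python
import xml.etree.ElementTree as ET

# Key under which ElementTree stores the xml:lang attribute
# (the "xml" prefix is predefined, so no declaration is needed).
XML_LANG = "{http://www.w3.org/XML/1998/namespace}lang"


def drop_language(root, lang="en-GB"):
    """Remove every <tuv> whose xml:lang equals `lang`; return the count removed."""
    removed = 0
    for tu in root.iter("tu"):
        # Copy the child list first so removing items does not disturb iteration.
        for tuv in list(tu.findall("tuv")):
            if tuv.get(XML_LANG) == lang:
                tu.remove(tuv)
                removed += 1
    return removed


# Sample taken from the question; the <body> wrapper is an assumed root element.
sample = """<body>
<tu tuid="ID_0">
<tuv xml:lang="en-GB">
<seg>Hello!</seg>
</tuv>
<tuv xml:lang="de-DE">
<seg>Hallo!</seg>
</tuv>
</tu>
</body>"""

root = ET.fromstring(sample)
removed_count = drop_language(root)
result = ET.tostring(root, encoding="unicode")
print(result)
```

For a real file, ET.parse on the input path followed by tree.write on an output path with encoding="utf-8" should do the same job; the paths are whatever your files are called. It is worth keeping a backup, because ElementTree may change details such as the XML declaration or a DOCTYPE line.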
Suresh Anbarasan