My application has a String variable that contains XML data.
I am trying to remove all <product_desc></product_desc> elements using Regex.
Here is the value of the string variable:
<orderlines>
<orderline>
<id>1000001</id>
<product_id>2004</product_id>
<product_desc>ITEM2004
Color: red
Size: 150x10x10
Material: iron
</product_desc>
<qnt>2</qnt>
</orderline>
<orderline>
<id>1000002</id>
<product_id>2012</product_id>
<product_desc>ITEM2012</product_desc>
<qnt>4</qnt>
</orderline>
<orderline>
<id>1000003</id>
<product_id>3000</product_id>
<product_desc>DELIVERY</product_desc>
<qnt>1</qnt>
</orderline>
</orderlines>
When I use the following pattern:
Dim pattern As String = "(<product_desc>[\s\S]*</product_desc>)"
Dim newvalue As String = Regex.Replace(originvalue, pattern, "")
I get a result like this:
<orderlines>
<orderline>
<id>1000001</id>
<product_id>2004</product_id>
<qnt>1</qnt>
</orderline>
</orderlines>
So the problem is that Regex matches everything between the first <product_desc> and the last </product_desc> and replaces it with an empty string. This also removes all the <orderline> tags between them (check the value of the <qnt> tag).
Can anybody give me a tip on how to limit the removal to each individual <product_desc> element? The content of the tag can contain any characters, newlines, and even HTML code.
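The first-to-last matching described above comes from the greedy `*` quantifier. A minimal sketch of the lazy variant (`*?`) follows, assuming <product_desc> is never nested inside itself. The names `ProductDescDemo` and `RemoveProductDesc` and the shortened sample string are hypothetical and only for illustration.

```vbnet
Imports System
Imports System.Text.RegularExpressions

Module ProductDescDemo
    ' Hypothetical helper: removes each <product_desc>...</product_desc> pair on its own.
    ' Assumes the element is never nested inside itself.
    Function RemoveProductDesc(xml As String) As String
        ' "*?" is lazy: the match ends at the first </product_desc> after each opening tag,
        ' while [\s\S] still allows newlines and markup inside the description.
        Dim pattern As String = "<product_desc>[\s\S]*?</product_desc>"
        Return Regex.Replace(xml, pattern, "")
    End Function

    Sub Main()
        ' Shortened stand-in for the original string (an assumption, not the real data).
        Dim originvalue As String =
            "<orderline><id>1</id><product_desc>ITEM" & Environment.NewLine &
            "Color: red</product_desc><qnt>2</qnt></orderline>" &
            "<orderline><id>2</id><product_desc>DELIVERY</product_desc><qnt>4</qnt></orderline>"

        Console.WriteLine(RemoveProductDesc(originvalue))
    End Sub
End Module
```

With this pattern both <qnt> elements should survive, because each description is matched and removed separately. If the description could itself contain a literal </product_desc>, an XML parser would probably be a safer choice than Regex.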