Textwrangler grep regex expression to remove every except one

Question

I am desperately looking for a TextWrangler grep pattern to remove all the multi-language garbage from a CSV file.

I have a bunch of:

<span class="multilang" lang="en">Portugal</span><span class="multilang" lang="it">Portogallo</span><span class="multilang" lang="pt">Portugal</span><span class="multilang" lang="no">Portugal</span>

I would like to remove the whole span tag, that is, everything between

<span class="multilang" lang="en">

and the first occurrence of

</span>

including the text inside ("Portugal" in this case). For the sample above, it would remove

<span class="multilang" lang="en">Portugal</span>

Obviously "Portugal" is only an example; I have lots of entries like this in the original CSV.

I tried this:

</?span class="multilang" lang="en"*>(.*)(</span>).*\1

But it does not work at all: it finds no matches.

Thank you in advance, Francesco

score 0 · Accepted Answer · answered Nov 17 '15 at 14:47

You may try this:

<span class="multilang" lang="en">([^<]*)<\/span>
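To check the pattern outside the editor, here is a minimal sketch in Python. It assumes Python's `re` module treats these constructs the same way TextWrangler's grep does; the function name `strip_english_span` and the sample string are only for illustration. Replacing every match with an empty string corresponds to leaving the "Replace" field empty in TextWrangler.

```python
import re

# Pattern from the answer above. [^<]* matches the text up to the next "<",
# so the match cannot run past the first closing tag.
# The backslash before "/" is harmless in Python.
EN_SPAN = re.compile(r'<span class="multilang" lang="en">([^<]*)<\/span>')


def strip_english_span(text: str) -> str:
    """Remove every English multilang span together with its inner text."""
    return EN_SPAN.sub("", text)


# Sample line taken from the question.
sample = (
    '<span class="multilang" lang="en">Portugal</span>'
    '<span class="multilang" lang="it">Portogallo</span>'
    '<span class="multilang" lang="pt">Portugal</span>'
    '<span class="multilang" lang="no">Portugal</span>'
)

cleaned = strip_english_span(sample)
print(cleaned)
```

Note that `[^<]*` only works as long as the text inside the span contains no other tags, which holds for the sample in the question.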

Thomas Ayoub

Thank you so much. It worked flawlessly, but in the meantime I found the other suggested solution (see below). – shaice Nov 17 '15 at 15:06

score 0 · Answer 2 · answered Nov 17 '15 at 14:51

OK, thank you. I found the answer in the TextWrangler manual:

</?span class="multilang" lang="en"*>(.*?)</span>

Page 146: http://pine.barebones.com/manual/TextWrangler_User_Manual.pdf
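The key change from the original attempt is the non-greedy `.*?`, which stops at the first `</span>` instead of the last one. The sketch below illustrates the difference in Python, assuming its `re` module handles `.*` and `.*?` the same way TextWrangler's grep does; the variable names are only for illustration. In the posted pattern, `/?` makes the slash optional and `"*` allows zero or more quotes, so neither is needed for this input, but the pattern still matches.

```python
import re

# Sample line taken from the question.
line = (
    '<span class="multilang" lang="en">Portugal</span>'
    '<span class="multilang" lang="it">Portogallo</span>'
    '<span class="multilang" lang="pt">Portugal</span>'
    '<span class="multilang" lang="no">Portugal</span>'
)

# Greedy: .* runs to the LAST </span> on the line.
greedy = re.compile(r'<span class="multilang" lang="en">(.*)</span>')
# Non-greedy: .*? stops at the FIRST </span>.
lazy = re.compile(r'<span class="multilang" lang="en">(.*?)</span>')
# The pattern exactly as posted in this answer.
posted = re.compile(r'</?span class="multilang" lang="en"*>(.*?)</span>')

greedy_result = greedy.sub("", line)  # removes the other languages too
lazy_result = lazy.sub("", line)      # removes only the English span
posted_result = posted.sub("", line)  # same effect as the lazy pattern here

print(repr(greedy_result))
print(lazy_result)
print(posted_result == lazy_result)
```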

shaice
