0

Using a regular expression, how do you match a specific word that is not inside an <a> tag?

E.g. I am looking for occurrences of the word software that are not part of a link (i.e. not surrounded by <a ...> ... </a>).

Sample input

... <a href='#'>this software</a> ... software ... <a href='#'>software</a>.

Is it possible using regex to match only the second software?

If it is not possible, how do you check in C# whether the matched text is inside an <a> tag?

Aximili

2 Answers

4

Possible : Yes

Recommended : No

There are plenty of HTML parsers out there that might help.

Here's a good read on why it's not recommended: RegEx match open tags except XHTML self-contained tags. I couldn't put it better even if I tried.
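As a rough illustration of the parser route, here is a minimal sketch that uses only the built-in System.Xml.Linq classes. It assumes the fragment is well-formed XML once wrapped in a root element (the `<root>` wrapper is my own addition); real-world HTML often is not, in which case a dedicated HTML parser would be needed instead. Note that the element name check is case-sensitive, so `<A>` would not be recognised here.

```csharp
using System;
using System.Linq;
using System.Text.RegularExpressions;
using System.Xml.Linq;

class Program
{
    static void Main()
    {
        // Sample input from the question.
        string html = "... <a href='#'>this software</a> ... software ... <a href='#'>software</a>.";

        // Assumption: the fragment parses as XML when wrapped in a
        // hypothetical <root> element. Malformed HTML will throw here.
        XElement root = XElement.Parse("<root>" + html + "</root>");

        // Keep only the text nodes that have no <a> element among their ancestors.
        var textOutsideLinks = root.DescendantNodes()
            .OfType<XText>()
            .Where(t => !t.Ancestors("a").Any());

        // Count whole-word occurrences of "software" in those text nodes.
        int count = textOutsideLinks
            .Sum(t => Regex.Matches(t.Value, @"\bsoftware\b").Count);

        Console.WriteLine(count);
    }
}
```

For the sample input only the middle occurrence sits outside a link, so that is the only one counted.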

Noctis
0

I am not fully clear on the requirement. The following regex should hopefully provide a starting point for what you are looking for. It uses a lookbehind to match everything that follows a closing tag:

(?<=\</\w*\>).*
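
To see what that pattern does on the sample input from the question, and how it might be narrowed down to the word itself, here is a small C# sketch. The second pattern, with its negative lookahead, is my own hypothetical refinement and not something from the question. It assumes links contain plain text only (no nested tags) and that the closing tag is written exactly as `</a>`.

```csharp
using System;
using System.Text.RegularExpressions;

class Program
{
    static void Main()
    {
        // Sample input from the question.
        string input = "... <a href='#'>this software</a> ... software ... <a href='#'>software</a>.";

        // The pattern above: the lookbehind requires a closing tag immediately
        // before the match, and .* then takes the rest of the line, which
        // includes any later links.
        Match rest = Regex.Match(input, @"(?<=\</\w*\>).*");
        Console.WriteLine("After closing tag: " + rest.Value);

        // Hypothetical refinement: match the word only when the next tag
        // after it is not a closing </a>, i.e. when it is not link text.
        // Assumption: <a> elements contain no nested tags.
        foreach (Match m in Regex.Matches(input, @"\bsoftware\b(?![^<]*</a>)"))
        {
            Console.WriteLine("Found '" + m.Value + "' at index " + m.Index);
        }
    }
}
```

On the sample input the refined pattern should report only the occurrence between the two links. It will misfire on markup such as `<a><b>software</b></a>`, which is one reason a parser is the safer choice.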
gpmurthy