I've been using boost::regex with boost::regex_search and found that when I run the regex

\\<(\\w+\\-?\\w+)\\>

These lines all get matched as expected:

<BitcoinicaHacker> Who wants free bitcoins courtesy of bitcoinica?
<grepix> who doesn't!
<BitcoinicaHacker> post your btc addr
<nanotube> i think bitcoinica wants free bitcoins courtesy of bitcoinica

But lines like these also get matched:

--> peacekeep3r (~peacekeep@chello084114169104.2.15.vie.surfer.at) has joined #bitcoin
<-- Raccoon has quit (Changing host)
--> Raccoon (bismuth@unaffiliated/raccoon) has joined #bitcoin

This is rather confusing, since I specifically asked it to find a left angle bracket, then text that might contain a dash, and then a right angle bracket.

Update 2:

Thanks to Ωmega for helping me find the best solution: <(\\w+(?:\\-\\w+)*)>

Update:

Either

<(\\w+\\-?\\w+)> or <([^-<>]+[^<>]*)> works for my purposes.

I forgot to remove the escape slashes.
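For anyone wondering why the backslashes mattered: in Boost.Regex's default Perl syntax, `\<` and `\>` are word-start and word-end assertions rather than literal angle brackets, so the escaped pattern matched any word of two or more characters. Below is a minimal sketch of the difference. It assumes `std::regex` in place of Boost (so it builds without extra libraries) and emulates `\<` / `\>` with `\b`, which is only an approximation. The helper names `matches_word_bounded` and `extract_nick` are made up for illustration.

```cpp
#include <regex>
#include <string>

// Approximation of the escaped pattern: Boost treats \< and \> as word
// boundaries; std::regex has no such escapes, so \b stands in for both.
inline bool matches_word_bounded(const std::string& line) {
    static const std::regex re(R"(\b(\w+-?\w+)\b)");
    return std::regex_search(line, re);
}

// Pattern from "Update 2", with literal angle brackets.
// Returns the captured nickname, or an empty string when nothing matches.
inline std::string extract_nick(const std::string& line) {
    static const std::regex re(R"(<(\w+(?:-\w+)*)>)");
    std::smatch m;
    return std::regex_search(line, m, re) ? m[1].str() : std::string();
}
```

With these, `matches_word_bounded` returns true for the join/quit lines (it finds a plain word such as `Raccoon`), while `extract_nick` returns `grepix` for `<grepix> who doesn't!` and an empty string for the `-->` and `<--` lines.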

ildjarn
trippedoutfish
  • Could this be related to not using the start and end anchors (^ and $)? – BlackVegetable Jun 28 '12 at 16:52
  • I just tried that and it no longer matches anything. I also tried just the ^ at the beginning, but that didn't work either. AFAIK those refer to the beginning and end of the input string, so adding them would be the same as using regex_match, and I would have to supply everything else that might be in the line as well. – trippedoutfish Jun 28 '12 at 17:05

1 Answer

Try the regex <([^-<>]+[^<>]*)>, which reads:

Match content between < and > that starts with a character other than -, <, or >, followed by any combination (possibly empty) of characters other than < or >.
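As a rough illustration of that reading, here is a small sketch. It assumes `std::regex` rather than Boost, and `nick_by_char_class` is a hypothetical helper name, not part of either library.

```cpp
#include <regex>
#include <string>

// Returns the text between < and > when it does not begin with '-',
// or an empty string when the line has no such bracketed text.
inline std::string nick_by_char_class(const std::string& line) {
    static const std::regex re(R"(<([^-<>]+[^<>]*)>)");
    std::smatch m;
    return std::regex_search(line, m, re) ? m[1].str() : std::string();
}
```

Note that this pattern is fairly permissive: it only restricts the first character, so bracketed text containing spaces (for example `<two words>`) is captured as well.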

Update:

You may also consider using the regex <((?!\\-\\-)[^<>]+)>, which reads:

Match content between < and > that does not start with -- and does not contain any < or >.
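A comparable sketch for the lookahead variant, again assuming `std::regex` (whose default ECMAScript grammar supports `(?!...)`) instead of Boost. The helper name `nick_by_lookahead` and the `<--marker>` input mentioned afterwards are made up for illustration.

```cpp
#include <regex>
#include <string>

// Returns the text between < and > unless it begins with "--",
// or an empty string when nothing matches.
inline std::string nick_by_lookahead(const std::string& line) {
    // (?!--) fails the match when the two characters after '<' are "--".
    static const std::regex re(R"(<((?!--)[^<>]+)>)");
    std::smatch m;
    return std::regex_search(line, m, re) ? m[1].str() : std::string();
}
```

Unlike the character-class version, this one still accepts a single leading dash: `<-dash>` yields `-dash`, whereas `<--marker>` is rejected.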

Ωmega