
In the word "tacacatac", I want to match "cat". It seems like the regex `c.*?t` should do this, but the match starts at the first occurrence of "c" and runs from there to the next "t", so it matches "cacat" instead.
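The behaviour described above can be reproduced with a minimal sketch. It assumes Python's `re` module rather than any particular engine from the thread, and the variable names are illustrative only. The lazy quantifier shortens the match from the right, but the engine still commits to the leftmost possible starting "c".

```python
import re

word = "tacacatac"

# Start positions are tried left to right, so the first "c" (index 2) wins.
# The lazy .*? then stops at the first "t" it can reach from there.
lazy = re.search(r"c.*?t", word)
print(lazy.group(0), lazy.start())  # cacat 2
```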

Is there a way (perhaps using a negative lookahead) to start matching from the "c" just before the final "t"?

Edit: I need an option that will still work if the letters are replaced with strings.

Thanks.

risraelsen
  • Why not `cat` directly? – Oleg Aug 12 '13 at 23:07
  • Because any letter can appear between the c and t. For example, it could have been cit or cot. This is a simplification of what I need to do: I actually need to capture the words between two keywords, each of which may appear a number of times in a document. – risraelsen Aug 12 '13 at 23:17

2 Answers


You can use a negated character class:

c[^ct]*t

This matches a c, then any run of characters other than c and t, then a t.
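As a quick check of the pattern above, here is a small sketch assuming Python's `re` module (the thread itself does not name a language for this answer). Because the class excludes "c", an attempt that starts at an earlier "c" fails as soon as it meets another "c", and the engine moves on to the next start position.

```python
import re

word = "tacacatac"

# Starting at index 2 fails: after "ca" the next character is "c", not "t".
# Starting at index 4 succeeds with "c", "a", "t".
m = re.search(r"c[^ct]*t", word)
print(m.group(0), m.start())  # cat 4
```

This only works while the delimiters are single characters, since a character class cannot exclude a multi-character string.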

Rohit Jain
  • Thanks. That will work for my example, but in reality I have words instead of letters. For example, in "the cat in the cat in the hat" I want to match the final "cat in the hat". I guess this can't be done with a negated character class. – risraelsen Aug 12 '13 at 23:08
  • Why not just reverse your data and then match the first occurrence? – hwnd Aug 12 '13 at 23:18
  • Hmm, I guess that's an option that would probably work. If there is no straightforward way of doing it, I'll just use the reverse function twice! Thanks for the suggestion. – risraelsen Aug 12 '13 at 23:23
  • If you reverse, you can match from beginning of the string `/^cat/`, or just use `/cat$/` or `/[Cc]at$/` to match cat at the end of the string. – hwnd Aug 12 '13 at 23:27
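One possible reading of the "reverse twice" idea from these comments, sketched in Python. The helper name `last_span_via_reverse` and its parameters are hypothetical and not taken from the thread. The text and both keywords are reversed, a lazy match is taken from the first occurrence of the reversed end keyword, and the result is reversed back.

```python
import re

def last_span_via_reverse(text, start_kw, end_kw):
    """Return the span from the nearest start_kw before the last end_kw, or None."""
    rev = text[::-1]
    # In the reversed text the end keyword comes first, so a lazy match
    # stops at the start keyword closest to it.
    pattern = re.escape(end_kw[::-1]) + r".*?" + re.escape(start_kw[::-1])
    m = re.search(pattern, rev, flags=re.DOTALL)
    return m.group(0)[::-1] if m else None

print(last_span_via_reverse("the cat in the cat in the hat", "cat", "hat"))
# cat in the hat
print(last_span_via_reverse("tacacatac", "c", "t"))
# cat
```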

Try this:

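my $str = "the cat in the cat in the hat";
# (?:(?!cat).)* consumes one character at a time, but never steps onto another "cat"
if ($str =~ /(cat(?:(?!cat).)*hat)/) {
    print $1, "\n";    # prints "cat in the hat"
}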
newestbie
  • Great. It looks like that will do the trick. I'm sure I can implement it, but not sure I completely understand why it works. Can you (or somebody) explain exactly how the syntax (?:(?!cat).)* works? – risraelsen Aug 13 '13 at 03:05
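The construct asked about in the last comment can be unpacked piece by piece. The sketch below rewrites the same pattern with Python's `re.VERBOSE` flag so each part carries a comment. Python is an assumption here (the answer above uses Perl), but the lookahead syntax is the same in both.

```python
import re

text = "the cat in the cat in the hat"

pattern = re.compile(r"""
    cat            # opening keyword
    (?:            # non-capturing group, repeated by the * below
        (?!cat)    # negative lookahead: fail if "cat" starts at this position
        .          # otherwise consume exactly one character
    )*
    hat            # closing keyword
""", re.VERBOSE)

m = pattern.search(text)
print(m.group(0), m.start())  # cat in the hat 15
```

Since the repeated group can never step onto another "cat", the attempt that starts at the first "cat" runs out of room before reaching "hat" and fails. The engine then retries from later positions until it reaches the second "cat", from which "hat" is reachable.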