
I read that you should use ? to match text non-greedily, so the regex

http://.*?\.png

...used on

http://example.png.png

...would return http://example.png.

But the non-greediness only seems to work from left to right. That is, if I matched it on

http://http://example.png

...it would return http://http://example.png.
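
Both results can be reproduced with a minimal sketch. It assumes Python's re module, since the question does not name a regex flavor. The engine tries start positions from left to right, and the lazy quantifier only shortens the match from whichever start position succeeds first.

```python
import re

pattern = re.compile(r"http://.*?\.png")

# Lazy matching stops at the first ".png" reachable from the start position...
first = pattern.search("http://example.png.png").group(0)

# ...but the start position is still the leftmost "http://" that can succeed.
second = pattern.search("http://http://example.png").group(0)

print(first)   # http://example.png
print(second)  # http://http://example.png
```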

How can I get the code to match http://example.png only?

Alan Moore
Jessica

2 Answers


Try this:

http://[A-Za-z0-9_-]+\.png

It won't match starting at the first http:// because the text between that point and .png (http://example) contains : and /, which are not in [A-Za-z0-9_-].

You could also use this if you are worried about other characters in the URL:

http://[^:]+?\.png
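
A quick check of both patterns, sketched in Python (the choice of Python's re module and the port-number URL are assumptions for illustration, not part of the original answer):

```python
import re

text = "http://http://example.png"

strict = re.compile(r"http://[A-Za-z0-9_-]+\.png")
loose = re.compile(r"http://[^:]+?\.png")

# Neither pattern can start at the first "http://": the ":" of the second
# "http://" is excluded by both character classes.
strict_match = strict.search(text).group(0)
loose_match = loose.search(text).group(0)

# Hypothetical URL with a port number: "[^:]" cannot cross ":80",
# so the looser pattern finds nothing here.
port_match = loose.search("http://example.com:80/logo.png")

print(strict_match)  # http://example.png
print(loose_match)   # http://example.png
print(port_match)    # None
```

Note that the stricter pattern also rejects dots and slashes, so it only fits URLs shaped like the one in the question.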
Smern
  • Thanks! Do all URLs only consist of A-Z, a-z, 0-9, -, _, and %? – Jessica Aug 09 '13 at 21:51
  • I've never even seen a URL with `%` to be honest. You could replace the `[A-Za-z0-9_-]` portion with `[^:]+?` and it would probably work fine for any of your cases if you are worried about that. – Smern Aug 09 '13 at 21:52
  • Oh, I just realized one issue: URLs with port numbers, like http://google.com:80/, are valid, but I don't think they are matched by this regex. – Tom Belote Aug 10 '13 at 19:06
  • True enough, I suppose the negative look ahead would work better if this is a concern. I'm sort of curious what the use case of this is. – Smern Aug 10 '13 at 20:36

You could use a negative lookahead too, but I think smerny's answer is better.

http://(?!http://).*?\.png
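
A rough Python sketch of how this behaves (Python's re module and the sample inputs are assumptions for illustration). The lookahead is only tested once, immediately after the first http://, so a second http:// further along is still swallowed by .*?. The `tempered` variant below is not from the answer; it repeats the lookahead before every character as one possible way to close that gap.

```python
import re

lookahead = re.compile(r"http://(?!http://).*?\.png")
# Variant not in the original answer: check the lookahead at every character.
tempered = re.compile(r"http://(?:(?!http://).)*?\.png")

doubled = "http://http://example.png"
spaced = "http://a http://example.png"       # hypothetical input
with_port = "http://example.com:80/logo.png"  # hypothetical input

doubled_match = lookahead.search(doubled).group(0)
spaced_match = lookahead.search(spaced).group(0)
tempered_match = tempered.search(spaced).group(0)
port_match = lookahead.search(with_port).group(0)

print(doubled_match)   # http://example.png
print(spaced_match)    # http://a http://example.png
print(tempered_match)  # http://example.png
print(port_match)      # http://example.com:80/logo.png
```

Unlike the character-class patterns, the lookahead version places no restriction on colons, so URLs with port numbers still match.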
Tom Belote