local s = "http://example.com/image.jpg"
print(string.match(s, "/(.-)%.jpg"))
This gives me
--> /example.com/image
But I'd like to get
--> image
If you're sure there is a / in the string just before the filename, this works:
print(string.match(s, ".*/(.-)%.jpg"))
The greedy match .*/ will stop at the last /, as desired.
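A minimal sketch of the same idea wrapped in a helper; the name basename_jpg is made up for illustration and is not from the question. It also shows the caveat mentioned above: with no / in the string, the pattern does not match at all.

```lua
-- Hypothetical helper (name chosen for this example): return the file name
-- without its .jpg extension, assuming a "/" precedes the file name.
local function basename_jpg(url)
  -- ".*/" greedily consumes as much as it can, then backtracks to the last
  -- "/" that still lets "(.-)%.jpg" match the remainder.
  return string.match(url, ".*/(.-)%.jpg")
end

print(basename_jpg("http://example.com/image.jpg"))  --> image
print(basename_jpg("image.jpg"))                     --> nil (no "/" in the string)
```

If strings without a / are possible, the nil result needs to be handled by the caller.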
Since the pattern engine processes a string from left to right, your pattern found the first /, then .- matched any chars (.) as few as possible (-) up to the first literal . (matched with %.) followed by the substring jpg.
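One way to check where the original pattern starts matching is string.find, which returns the match boundaries along with the capture. A small sketch using the string from the question:

```lua
local s = "http://example.com/image.jpg"

-- string.find returns the start index and end index of the whole match,
-- followed by any captures.
local first, last, capture = string.find(s, "/(.-)%.jpg")

-- print separates its arguments with tabs.
print(first, last, capture)  --> 6	28	/example.com/image
```

The match starts at index 6, the first / of the //, so the capture begins with the second /.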
You need to use a negated character class [^/] (to match any char but /) rather than . that matches any character:
local s = "http://example.com/image.jpg"
print(string.match(s, "/([^/]+)%.jpg"))
-- => image
The [^/] matches any char but /, so the first / in the pattern "/([^/]+)%.jpg" ends up matching the last / in the string.
Removing the first / from the pattern is not a good idea, as it makes the engine take more redundant steps while trying to find a match. The / "anchors" the quantified subpattern at the / symbol: it is easier for the engine to find a / than to look for a run of 1 or more chars other than / whose starting point is not fixed.
If you are sure the file name appears at the end of the string, add $ at the end of the pattern (it is not clear whether you need that, but it might be best in the general case).
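A short sketch of what the $ anchor changes. The URL with a query string is a made-up example, not taken from the question:

```lua
local unanchored = "/([^/]+)%.jpg"
local anchored   = "/([^/]+)%.jpg$"

local plain      = "http://example.com/image.jpg"
local with_query = "http://example.com/image.jpg?size=2"  -- hypothetical URL

print(string.match(plain, anchored))         --> image
print(string.match(with_query, unanchored))  --> image
print(string.match(with_query, anchored))    --> nil (.jpg is not at the end)
```

Whether the stricter behavior is wanted depends on the inputs you expect.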
Why doesn't this match non-greedily and give me just the image name?
To answer the question directly: .- doesn't guarantee the shortest match, because the left end of the match is still anchored at the current position, and if something matches at that position, it is returned as the result. Non-greedy just means that the item consumes as few characters as possible while still letting the rest of the pattern match.
That's why using [^/]- fixes the pattern: it finds the shortest run of characters that are not forward slashes. It is also why .*/.- works: in this case .* greedily consumes everything and then backtracks until the rest of the pattern is satisfied (which produces the same result here).
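To compare the three variants discussed here side by side, a small sketch that runs each pattern against the string from the question:

```lua
local s = "http://example.com/image.jpg"

local patterns = {
  "/(.-)%.jpg",     -- lazy, but the match starts at the first "/"
  "/([^/]-)%.jpg",  -- lazy, and the capture cannot cross a "/"
  ".*/(.-)%.jpg",   -- greedy prefix backtracks to the last "/"
}

-- Print each pattern next to what it captures.
for _, p in ipairs(patterns) do
  print(p, string.match(s, p))
end
```

The first pattern captures /example.com/image, while the other two capture image.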