Basically, you go over the string with the wildcards (I'll call that string pattern from now on):
If the first character of the pattern is any character other than ?, try to consume exactly that character from the input string (the other string, the one without wildcards).
If the first character of the pattern is a ?, you have two cases:
- The ? should match a character (in order to match the complete pattern), so just consume the next character from the input and go on.
- The ? should not match a character, in which case you go on with the next character of the pattern and leave the input string unchanged.
Of course, you cannot know in advance which of these two cases applies. So you need to be able to go back to that point in case your guess was wrong.
For that, you can use recursion (or more precisely: the fresh context, in terms of local variables and such, that you get from a recursive call):
You first call your matching function with the remaining pattern and the unchanged input string, and if that fails, you call it again with the remaining pattern and the input string without its first character (thus making the ? consume a character).
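The two guesses described above can be sketched in a few lines. This is only a minimal illustration in Python, not the C++ solution; the function name wildcard_match and the argument order are assumptions made for this sketch, and ? is taken to match zero or one character, as described above.

```python
def wildcard_match(text: str, pattern: str) -> bool:
    """Return True if pattern matches all of text ('?' = zero or one character)."""
    if not pattern:
        # Pattern used up: success only if the input is used up as well.
        return not text
    if pattern[0] == "?":
        # First guess: the wildcard consumes nothing.
        if wildcard_match(text, pattern[1:]):
            return True
        # The guess was wrong, so we are "back" here with text and pattern
        # untouched. Second guess: the wildcard consumes one character,
        # which requires that there is one left.
        return bool(text) and wildcard_match(text[1:], pattern[1:])
    # Ordinary character: it must equal the next input character.
    return bool(text) and text[0] == pattern[0] and wildcard_match(text[1:], pattern[1:])
```

Going back to the point of the guess needs no extra bookkeeping here: each recursive call works on its own copies of the two strings, so the caller's values are still intact when a call returns false.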
Example:
pattern: s?y
input: say
The first character of the pattern is an s. This is normal non-wildcard matching, and it matches the first character of the input, so move both on:
pattern: ?y
input: ay
Now there's a wildcard to match, so pretend that it didn't consume any character and let's see where that gets us. Call the matching function with:
pattern: y
input: ay
Ouch, that doesn't match (a != y), so return false at this point. This brings us back to where we called the matching function (in the step above), leaving us with:
pattern: ?y
input: ay
We already tried to match the wildcard as no character, so now try to match it as any character, thus consuming the a:
pattern: y
input: y
Wow, matches, and both strings are empty on the next run, so we have a match!
This seems to be homework, which you probably have to implement in C++. I won't give you that code. Rather, I'll give you an implementation in a different language - Clojure - which should allow you to further understand the above algorithm.
(ns wildcards
  (:refer-clojure))

(defn- dpr
  "Debug printing with poor man's indentation"
  [pattern & rest]
  (print (repeat (- 6 (count pattern)) " "))
  (apply println rest))

(defn wildcard-match [input pattern]
  (println "wildcard-match " input pattern)
  (if (empty? pattern)
    ;; Pattern is used up, so this is a match only if the input is used up, too.
    ;; (An empty input alone must not end the match: a trailing ? may still
    ;; match no character.)
    (empty? input)
    ;; Else
    (if (= (first pattern) \?)
      ;; Wildcard, so use the short-circuiting or:
      (or (do
            (dpr pattern "Try to match no character...")
            (wildcard-match input (rest pattern)))
          (do
            (dpr pattern "Ok, so try to match any character...")
            ;; Only possible if there is a character left to consume
            (and (not (empty? input))
                 (recur (rest input) (rest pattern)))))
      ;; Non-wildcard, test for equality, and if equal, go on.
      (and (= (first pattern) (first input))
           (recur (rest input) (rest pattern))))))

(defn testcase [input pattern]
  (println "#####################################")
  (println "Trying to match" input "with" pattern)
  (println "=>" (wildcard-match (seq input) (seq pattern)))
  (println))

(doall (map #(testcase (first %) (second %))
            [["hello" "hello"]
             ["hello" "h?l?o"]
             ["hllo" "h?l?o"]
             ["hlo" "h??lo"]
             ["hello" "h?lo"]
             ["hello" "h???p"]]))
You can see this being executed here:
http://ideone.com/8o4QdR
Since Clojure is a functional language that relies heavily on recursion, you see a lot of recursion going on there. Translating that to a more imperative language like C++ should get rid of most of these recursive calls, in particular those that can be replaced by loops (that's all the recur calls, leaving only one necessary use of recursion).
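To illustrate what such a translation might look like without spelling it out in C++, here is a rough sketch in Python. The name wildcard_match_loop and the index variables i and j are assumptions made for this illustration only. Every step that was a recur call becomes an iteration of the while loop, and the single remaining recursive call is the one that tries matching the ? against no character.

```python
def wildcard_match_loop(text: str, pattern: str) -> bool:
    """Loop-based variant: '?' matches zero or one character."""
    i = 0  # position in text (the input)
    j = 0  # position in pattern
    while j < len(pattern):
        if pattern[j] == "?":
            # The one necessary recursive call: guess that the wildcard
            # matches no character and try the rest of the pattern.
            if wildcard_match_loop(text[i:], pattern[j + 1:]):
                return True
            # Guess failed: the wildcard has to consume one character,
            # which is only possible if the input is not exhausted.
            if i == len(text):
                return False
        elif i == len(text) or text[i] != pattern[j]:
            # Ordinary character that does not match the input.
            return False
        # Both remaining cases advance input and pattern by one; this is
        # what the recur calls did in the recursive version.
        i += 1
        j += 1
    # Pattern exhausted: match only if the input is exhausted as well.
    return i == len(text)
```

For example, wildcard_match_loop("say", "s?y") first matches the s in the loop, then recurses once for the ? with the remaining "ay" against "y", which fails, and finally lets the ? consume the a before matching the y.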