
I am trying to write a function in C++ that compares two strings s1 and s2, where only s2 may contain '?' characters. A '?' can match any single character or no character at all. For example, colo?r matches both "color" and "colour". This query should report every word that matches. Other examples:

hello:hello--true

hello:h?l?o--true (both ? act as wildcards)

hllo:h?l?o--true (first ? acts as empty, second ? acts as a wildcard)

hlo:h??lo--true (both ? act as empty)

hello:h?lo--false (a ? can replace only one character, not a string)

hello:h???p--false (p does not match any of the possible characters)

I tried writing many functions using loops, but I can only handle the cases where all '?' characters act as empty or all act as wildcards. When one acts as empty and another as a wildcard, there are so many different strings to compare that things get out of control.

My professor told us that recursion is the key to solving this problem, but we haven't discussed recursion much yet. Please help me with suggestions or code that uses a backtracking technique to solve this problem.

1 Answer


Basically, you go over the string with the wildcards (I'll call that string pattern from now on):

If the first character of the pattern is any character other than ?, then try to consume exactly that character from the input string (the other string, the one without wildcards). If the input does not start with that character, the match fails.

If the first character of the pattern is a ?, then you have two cases:

  1. The ? should match a character (in order to match the complete pattern), so just consume the next character from the input and go on.
  2. The ? should not match a character, in which case you'd go on with the next character from the pattern and leave the input string unchanged.

Of course, you cannot know in advance which of these cases to choose. So you need to be able to go back to that point in case your guess was wrong.

For that, you can use recursion (or more precisely: the new context, in terms of local variables and such, that you get from a recursive call):

You just call your matching function first with the remaining pattern and the input string, and if that fails you call your matching function with the remaining pattern and the input string without its first character (thus making the ? consume a character).

Example:

pattern: s?y
input:   say

The first character of the pattern is an s. This is normal non-wildcard matching, and the first character of the input is also an s, so it matches and both strings move on:

pattern: ?y
input:   ay

Now there's a wildcard to match, so pretend that it didn't consume any character and let's see where that gets us. Call the matching function with:

pattern: y
input:   ay

Ouch, that doesn't match (a != y), so return false at this point. This brings us back to where we called the matching function (in the step above), leaving us with:

pattern: ?y
input:   ay

We already tried to match the wildcard as no character, now try to match it as any character, thus consume the a:

pattern: y
input:   y

Wow, matches, and both strings are empty on the next run, so we have a match!
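To make the order of those attempts visible, here is a minimal sketch of the same walk-through in Python (deliberately not C++). The function name `matches` and the `trace` list are made up for this illustration. Each call records the pattern and input it was given, so the failed "no character" guess and the retry show up in the output.

```python
def matches(text, pattern, trace):
    """Return True if pattern matches text, where '?' is zero or one character.

    Every call appends its (pattern, text) pair to trace (a hypothetical
    helper list) so the order of attempts can be inspected afterwards.
    """
    trace.append((pattern, text))
    if not pattern:
        # Pattern used up: a match only if the input is used up as well.
        return not text
    if pattern[0] == "?":
        # First guess: the wildcard consumes nothing.
        if matches(text, pattern[1:], trace):
            return True
        # Guess was wrong, so back up and let it consume one character.
        return bool(text) and matches(text[1:], pattern[1:], trace)
    # Ordinary character: it has to equal the next input character.
    return bool(text) and text[0] == pattern[0] and matches(text[1:], pattern[1:], trace)


trace = []
result = matches("say", "s?y", trace)
for pattern, text in trace:
    print(f"pattern: {pattern!r:6} input: {text!r}")
print("match:", result)
```

The recorded calls follow the steps above: s?y/say, then ?y/ay, then the failed guess y/ay, then y/y, and finally both strings empty.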


This seems to be homework, which you probably have to implement in C++. I won't give you that code. Rather, I'll give you an implementation in a different language - Clojure - which should allow you to further understand the above algorithm.

(ns wildcards
  (:refer-clojure))

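(defn- dpr
  "Debug printing with poor man's indentation"
  [pattern & rest]
  ;; Join the padding into one string; printing the lazy seq itself
  ;; would also print its parentheses.
  (print (apply str (repeat (- 6 (count pattern)) " ")))
  (apply println rest))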

(defn wildcard-match [input pattern]
  (println "wildcard-match " input pattern)
  (if (empty? pattern)
    ;; Pattern is used up, return true only if the input is used up too
    (empty? input)
    ;; Else
    (if (= (first pattern) \?)
      ;; Wildcard, so with short-circuiting or:
      (or (do
            (dpr pattern "Try to match no character...")
            (wildcard-match input (rest pattern)))
          (do
            (dpr pattern "Ok, so try to match any character...")
            ;; Consuming a character requires that one is left. Checking
            ;; this here lets a trailing ? match nothing, e.g. "h" with "h?".
            (and (not (empty? input))
                 (recur (rest input) (rest pattern)))))
      ;; Non-wildcard, test for equality, and if equal, go on.
      ;; (first input) is nil for an empty input, so this is false then.
      (and (= (first pattern) (first input))
           (recur (rest input) (rest pattern))))))



(defn testcase [input pattern]
  (println "#####################################")
  (println "Trying to match" input "with" pattern)
  (println "=>" (wildcard-match (seq input) (seq pattern)))
  (println))


(doall (map #(testcase (first %) (second %))
  [["hello" "hello"]
   ["hello" "h?l?o"]
   ["hllo" "h?l?o"]
   ["hlo" "h??lo"]
   ["hello" "h?lo"]
   ["hello" "h???p"]]))

You can see this being executed here: http://ideone.com/8o4QdR

Since Clojure is a functional language that relies heavily on recursion, you see a lot of recursion going on there. Translating that to a more imperative language like C++ should get rid of most of these recursions, in particular those which can be replaced by loops (that's all the recur calls, leaving only one necessary use of recursion).
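As a rough illustration of that shape, here is a sketch in Python, so the C++ translation is still left to you. The name `matches` and the index variables `i` and `j` are assumptions of this sketch, not taken from the Clojure code. Each recur becomes a step of the while loop, and the only remaining recursive call is the "wildcard matches nothing" guess.

```python
def matches(text, pattern):
    """'?' in pattern stands for zero or one arbitrary character."""
    i = 0  # position in text
    j = 0  # position in pattern
    while j < len(pattern):
        if pattern[j] == "?":
            # The one real recursive call: try the "matches nothing" guess
            # on the rest of the pattern without moving forward in the text.
            if matches(text[i:], pattern[j + 1:]):
                return True
            # Guess failed, so the wildcard has to consume one character.
            if i == len(text):
                return False
            i += 1
            j += 1
        elif i < len(text) and text[i] == pattern[j]:
            # Ordinary character matched: advance both positions.
            i += 1
            j += 1
        else:
            return False
    # Pattern exhausted: success only if the text is exhausted too.
    return i == len(text)


# The six examples from the question, as (input, pattern) pairs.
examples = [("hello", "hello"), ("hello", "h?l?o"), ("hllo", "h?l?o"),
            ("hlo", "h??lo"), ("hello", "h?lo"), ("hello", "h???p")]
for text, pattern in examples:
    print(f"{text}:{pattern}--{matches(text, pattern)}")
```

Slicing copies the strings on every recursive call, which keeps the sketch short. Passing the two positions instead would avoid the copies.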

Daniel Jour