I wrote a program that prints an MD5 hash onto a bill. I want to be able to check the hash read back from the printed bill against a generated list of hashes. To do that, I use a Levenshtein distance function to find a generated hash that is within a small edit distance of the hash on the bill.

Here is my code:

func checkIfBillIsLegit(stringToCheck: String) -> Bool {
  for i in 0..<secretWords.count {                                     // loop runs about 5 times; ..< avoids indexing past the end
     let hashs = String().generateAll(secretWords[i])                  // create the MD5 hashes to check against, returns an array with 50 elements
     for j in 0..<hashs.count {
        if stringToCheck.minimumEditDistance(hashs[j]) < 5 {           // Levenshtein distance function
           print("legit")
           print(secretWords[i])
           return true
        }
     }
  }

  print("not legit")
  return false
}

I want to be able to run this method multiple times per second. It works now, but it's slightly too slow for what I want to do. The bottleneck is generateAll(), which is too slow to regenerate 50 hashes on every call. I was thinking of calling generateAll() outside of this method, but I can't figure out how to keep track of the resulting list of hashes.

Any help would be appreciated.

generateAll() method:

mawnch
  • How often does the array of `secretWords` change? – Paulw11 Sep 19 '16 at 04:50
  • The array of secretWords changes outside of this view controller. We can assume that it never changes. – mawnch Sep 19 '16 at 04:52
  • So you can calculate your hashes once and store the array of hashes for each word in a dictionary [String:[Hash]]. You can use a lazy property so that the hashes are calculated the first time they are needed. Alternatively you could calculate the hashes on a dictionary miss, so if words are added to the array, the system will automatically calculate new hashes. You could also use `NSCache` rather than a dictionary – Paulw11 Sep 19 '16 at 04:55
  • Can you write out in code what you're explaining? I'm not 100% sure what you are meaning. – mawnch Sep 19 '16 at 05:00
  • Can you post the code for `generateAll()`? There may be something there too. – i_am_jorf Sep 19 '16 at 05:13
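
The dictionary idea from the comments above could look roughly like the sketch below, written in current Swift syntax. `HashStore` and the `generate` closure are made-up names for illustration; the closure stands in for `generateAll()`, whose code is not shown in the question, and the hashes are assumed to be strings.

```swift
// Sketch: memoize the hashes for each secret word in a dictionary.
// `HashStore` and `generate` are hypothetical names; `generate` stands in
// for the asker's generateAll(), and hashes are assumed to be Strings.
final class HashStore {
    private var cache: [String: [String]] = [:]
    private let generate: (String) -> [String]

    init(generate: @escaping (String) -> [String]) {
        self.generate = generate
    }

    /// Returns the stored hashes for `word`, computing them on a dictionary miss.
    func hashes(for word: String) -> [String] {
        if let cached = cache[word] {
            return cached
        }
        let fresh = generate(word)
        cache[word] = fresh
        return fresh
    }
}

// Usage with a dummy generator that counts how often it is called.
var generatorCalls = 0
let store = HashStore { word in
    generatorCalls += 1
    return (0..<3).map { "\(word)-\($0)" }
}

let first = store.hashes(for: "alpha")    // computed by the generator
let second = store.hashes(for: "alpha")   // served from the dictionary
```

Because hashes are computed on a miss, words added to `secretWords` later are picked up automatically. A plain dictionary is not thread-safe, so this sketch assumes all calls come from one thread or queue.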

1 Answer

You can use an NSCache to ensure that you only calculate the hashes for a particular word when required. Typically this will be the first time your function is called for that word, but it will also happen if the array of secret words is expanded:

var hashCache = NSCache()

func checkIfBillIsLegit(stringToCheck: String) -> Bool {

    for secretWord in secretWords {
        // `Hash` is a placeholder for the element type that generateAll returns
        var hashes = hashCache.objectForKey(secretWord) as? [Hash]
        if hashes == nil {
            // Cache miss: generate the hashes once and store them for later calls
            hashes = String().generateAll(secretWord)
            hashCache.setObject(hashes!, forKey: secretWord)
        }

        for hash in hashes! {
            if stringToCheck.minimumEditDistance(hash) < 5 {
                print("legit")
                print(secretWord)
                return true
            }
        }
    }
    print("not legit")
    return false
}

If you want to know which "secret word" was the match, then I would change the function to return String?:

var hashCache = NSCache()

func checkIfBillIsLegit(stringToCheck: String) -> String? {

    for secretWord in secretWords {
        // `Hash` is a placeholder for the element type that generateAll returns
        var hashes = hashCache.objectForKey(secretWord) as? [Hash]
        if hashes == nil {
            // Cache miss: generate the hashes once and store them for later calls
            hashes = String().generateAll(secretWord)
            hashCache.setObject(hashes!, forKey: secretWord)
        }

        for hash in hashes! {
            if stringToCheck.minimumEditDistance(hash) < 5 {
                print("legit")
                print(secretWord)
                return secretWord   // the matching secret word
            }
        }
    }
    print("not legit")
    return nil                      // no hash was close enough
}
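
For readers on a newer toolchain, here is a rough sketch of the same caching step in current Swift syntax, where `NSCache` is generic over object types. `HashList`, `cachedHashes(for:)` and the dummy `generateAll(_:)` are assumptions made up for this example, and the hashes are assumed to be strings; only the `NSCache` calls are real Foundation API.

```swift
import Foundation

// NSCache stores objects, so a small wrapper class (hypothetical name)
// holds the Swift array of hashes.
final class HashList {
    let values: [String]
    init(_ values: [String]) { self.values = values }
}

let hashCache = NSCache<NSString, HashList>()

// Dummy stand-in for the asker's generateAll(); the real code is not shown.
func generateAll(_ secretWord: String) -> [String] {
    return (0..<50).map { "\(secretWord)#\($0)" }
}

/// Returns the hashes for `secretWord`, generating them only on a cache miss.
func cachedHashes(for secretWord: String) -> [String] {
    let key = NSString(string: secretWord)
    if let hit = hashCache.object(forKey: key) {
        return hit.values
    }
    let hashes = generateAll(secretWord)
    hashCache.setObject(HashList(hashes), forKey: key)
    return hashes
}

let hashesForWord = cachedHashes(for: "secret")
```

NSCache may evict entries under memory pressure, so the miss branch has to stay in place even after the first call.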
Paulw11
  • Thanks! I still need the index for the hash that had the correct minimum edit distance. How do I get that? Is it hashes.indexOf(hash) ? – mawnch Sep 19 '16 at 05:25
  • You could probably just change that loop back to a counted loop. However, do you want the index or the word? If you want the word, then the function should probably return `String?` rather than `Bool` and then either return the secret word or `nil`. Also, it doesn't look like `generateAll` should be an extension to `String`, at least not implemented in that way; `secretWord.generateAll()` would make more sense. – Paulw11 Sep 19 '16 at 05:27
  • Out of interest, what is the purpose of the minimum edit distance? Is it to allow for possible keying or OCR errors in the entry of the hash? – Paulw11 Sep 19 '16 at 05:30
  • You're right. `generateAll` shouldn't be an extension to string but it's useful to me in other parts of my app. I want the index because I have another array that has identifiers for the hashes in secretWords. And the purpose of the min edit distance is for OCR errors, good guess! – mawnch Sep 19 '16 at 05:39
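
One way to get the index asked about in these comments is to walk the hashes with `enumerated()` and return both positions. The sketch below uses current Swift syntax and assumes the hashes have already been generated into a nested array of strings. `BillMatch`, `findMatch` and `editDistance` are hypothetical names, and `editDistance` is a plain dynamic-programming Levenshtein distance standing in for `minimumEditDistance`, whose code is not shown. Like the original loop, it returns the first hash under the threshold rather than the closest one.

```swift
// Hypothetical result type: which secret word matched, and at which hash index.
struct BillMatch: Equatable {
    let wordIndex: Int
    let hashIndex: Int
}

// Dynamic-programming Levenshtein distance; a stand-in for the asker's
// minimumEditDistance, keeping only two rows of the table in memory.
func editDistance(_ a: String, _ b: String) -> Int {
    let s = Array(a), t = Array(b)
    if s.isEmpty { return t.count }
    if t.isEmpty { return s.count }
    var previous = Array(0...t.count)
    for i in 1...s.count {
        var current = [i] + Array(repeating: 0, count: t.count)
        for j in 1...t.count {
            let substitution = previous[j - 1] + (s[i - 1] == t[j - 1] ? 0 : 1)
            current[j] = min(previous[j] + 1, current[j - 1] + 1, substitution)
        }
        previous = current
    }
    return previous[t.count]
}

// Returns the position of the first hash within `maxDistance` edits of `scanned`.
// A default of 4 mirrors the `< 5` check in the question.
func findMatch(for scanned: String, in hashesPerWord: [[String]], maxDistance: Int = 4) -> BillMatch? {
    for (wordIndex, hashes) in hashesPerWord.enumerated() {
        for (hashIndex, hash) in hashes.enumerated() {
            if editDistance(scanned, hash) <= maxDistance {
                return BillMatch(wordIndex: wordIndex, hashIndex: hashIndex)
            }
        }
    }
    return nil
}

// Usage with made-up data: one OCR-style error in the last hash.
let table = [["aaaaaaaaaaaa", "bbbbbbbbbbbb"], ["cccccccccccc", "dddddddddddd"]]
let match = findMatch(for: "ddddddxddddd", in: table)
```

The returned `hashIndex` can then be used to look up the identifier in the separate array mentioned above.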