I am trying to do a keyword search in a sentence with Swift.

For example, given:

keywords = ["black", "bag", "love", "filled"]

Sentence1 = "There is a black bag in a house filled with love"

Sentence2 = "We are in a shop. There is a black bag on the counter"

Sentence3 = " The ocean is beautiful and lovely today"

I want to search each sentence for all of the keywords and report both the sentences that contain keywords and the ones that do not. The output should be:

Sentence1 : 4 keywords
Sentence2 : 2 keywords
Sentence3 : none

This is my attempt to solve this:

import Foundation

let rawSentences = ["There is a black bag in a house filled with love", "We are in a shop. There is a black bag on the counter", " The ocean is beautiful and lovely today"]

let keywords = ["black", "bag", "love", "filled"]

for item in rawSentences {
    var matchKeywords: [String] = []

    for kword in keywords {
        // range(of:) is a plain substring search, so "love" also matches inside "lovely"
        if item.range(of: kword) != nil {
            print("Yes!!!! \(kword) is in \(item)")
            matchKeywords.append(kword)
        }
    }

    print("There are \(matchKeywords.count) keywords in \(item)")
}

What is the best way to implement this in Swift?

e.iluf

1 Answer

If you only want to match whole words, you will need to use a regular expression and add word boundaries (\b) around each keyword. You can also make your search case and diacritic insensitive:

import Foundation

let sentences = ["There is a black bag in a house filled with love",
                 "We are in a shop. There is a black bag on the counter",
                 "The ocean is beautiful and lovely today"]
let keywords = ["black", "bag", "love", "filled"]

var results: [String: [String]] = [:]
for sentence in sentences {
    for keyword in keywords {
        let escapedPattern = NSRegularExpression.escapedPattern(for: keyword)
        let pattern = "\\b\(escapedPattern)\\b"
        if sentence.range(of: pattern, options: [.regularExpression, .caseInsensitive, .diacriticInsensitive]) != nil {
            results[sentence, default: []].append(keyword)
        }
    }
}

print(results)  // ["There is a black bag in a house filled with love": ["black", "bag", "love", "filled"], "We are in a shop. There is a black bag on the counter": ["black", "bag"]]
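
To get the per-sentence summary the question asks for, including sentences with no match at all, the same whole-word search can be wrapped in a small helper. This is only a minimal sketch: matchedKeywords(in:keywords:), sampleSentences and sampleKeywords are hypothetical names introduced for illustration, and the search here passes only .caseInsensitive alongside .regularExpression.

```swift
import Foundation

// Hypothetical helper (not part of Foundation): returns the keywords
// that occur in `sentence` as whole words, ignoring case.
func matchedKeywords(in sentence: String, keywords: [String]) -> [String] {
    keywords.filter { keyword in
        let pattern = "\\b" + NSRegularExpression.escapedPattern(for: keyword) + "\\b"
        return sentence.range(of: pattern, options: [.regularExpression, .caseInsensitive]) != nil
    }
}

// Sample data copied from the question.
let sampleSentences = ["There is a black bag in a house filled with love",
                       "We are in a shop. There is a black bag on the counter",
                       "The ocean is beautiful and lovely today"]
let sampleKeywords = ["black", "bag", "love", "filled"]

for (index, sentence) in sampleSentences.enumerated() {
    let count = matchedKeywords(in: sentence, keywords: sampleKeywords).count
    let summary = count == 0 ? "none" : "\(count) keywords"
    print("Sentence\(index + 1) : \(summary)")
}
// Sentence1 : 4 keywords
// Sentence2 : 2 keywords
// Sentence3 : none
```

The third sentence reports none because "lovely" does not contain "love" as a whole word, which is the difference from a plain range(of:) substring search.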

If you would like to know the location of the keywords in your sentences, all you need to do is append the ranges found instead of the keywords:

var results: [String:[Range<String.Index>]] = [:]
for sentence in sentences {
   for keyword in keywords {
       let escapedPattern = NSRegularExpression.escapedPattern(for: keyword)
       let pattern = "\\b\(escapedPattern)\\b"
       if let range = sentence.range(of: pattern, options: [.regularExpression, .caseInsensitive, .diacriticInsensitive]) {
           results[sentence, default: []].append(range)
       }
   }
}

print(results)  // ["We are in a shop. There is a black bag on the counter": [Range(Swift.String.Index(_rawBits: 1900544)..<Swift.String.Index(_rawBits: 2228224)), Range(Swift.String.Index(_rawBits: 2293760)..<Swift.String.Index(_rawBits: 2490368))], "There is a black bag in a house filled with love": [Range(Swift.String.Index(_rawBits: 720896)..<Swift.String.Index(_rawBits: 1048576)), Range(Swift.String.Index(_rawBits: 1114112)..<Swift.String.Index(_rawBits: 1310720)), Range(Swift.String.Index(_rawBits: 2883584)..<Swift.String.Index(_rawBits: 3145728)), Range(Swift.String.Index(_rawBits: 2097152)..<Swift.String.Index(_rawBits: 2490368))]]
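
The String.Index values printed above are opaque, so it can help to convert a range into character offsets before displaying it. A rough sketch using String's distance(from:to:); characterSpan(of:in:) and exampleSentence are hypothetical names introduced only for this illustration:

```swift
import Foundation

// Hypothetical helper: character offset and length of the first
// whole-word, case-insensitive match of `keyword` in `sentence`.
func characterSpan(of keyword: String, in sentence: String) -> (offset: Int, length: Int)? {
    let pattern = "\\b" + NSRegularExpression.escapedPattern(for: keyword) + "\\b"
    guard let range = sentence.range(of: pattern, options: [.regularExpression, .caseInsensitive]) else {
        return nil
    }
    // Count characters from the start of the string to the match, then across the match.
    let offset = sentence.distance(from: sentence.startIndex, to: range.lowerBound)
    let length = sentence.distance(from: range.lowerBound, to: range.upperBound)
    return (offset, length)
}

let exampleSentence = "There is a black bag in a house filled with love"
if let span = characterSpan(of: "bag", in: exampleSentence) {
    print("bag starts at character \(span.offset), length \(span.length)")
    // bag starts at character 17, length 3
}
```

The range itself can also be used directly as a subscript, for example sentence[range], when only the matched text is needed.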
Leo Dabus
  • This is a great solution! Could you please explain your code, especially what you are doing with NSRegularExpression and pattern? – e.iluf Aug 17 '20 at 01:22
  • @learner101 Escaping is not necessary if the keyword doesn't contain any special regex characters. From the documentation of escapedPattern(for:): "Returns a string by adding backslash escapes as necessary to protect any characters that would match as pattern metacharacters." – Leo Dabus Aug 17 '20 at 01:24
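
As a small illustration of what the escaping protects against, here is a sketch with a made-up keyword "a.b", where "." would otherwise act as a regex metacharacter matching any character. The names dottedKeyword, escapedKeyword and probeText are assumptions for the example only:

```swift
import Foundation

// Made-up keyword containing a regex metacharacter.
let dottedKeyword = "a.b"
let escapedKeyword = NSRegularExpression.escapedPattern(for: dottedKeyword)
print(escapedKeyword)  // a\.b

let probeText = "axb"
// Unescaped, "." matches any character, so "a.b" matches "axb".
let unescapedMatches = probeText.range(of: dottedKeyword, options: .regularExpression) != nil
// Escaped, the pattern only matches a literal "a.b".
let escapedMatches = probeText.range(of: escapedKeyword, options: .regularExpression) != nil
print(unescapedMatches, escapedMatches)  // true false
```

For plain alphabetic keywords such as the ones in the question, the escaped and unescaped patterns are identical.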