19

I am looking for the best way to check if a string contains a substring from a list of keywords.

For example, I create a list like this:

List<String> keywords = new ArrayList<>();
keywords.add("mary");
keywords.add("lamb");

String s1 = "mary is a good girl";
String s2 = "she likes travelling";

String s1 has "mary" from the keywords, but string s2 does not have it. So, I would like to define a method:

boolean containsAKeyword(String str, List<String> keywords)

Where containsAKeyword(s1, keywords) would return true but containsAKeyword(s2, keywords) would return false. I can return true even if there is a single substring match.

I know I can just iterate over the keywords list and call str.contains() on each item in the list, but I was wondering if there is a better way to iterate over the complete list (avoid O(n) complexity) or if Java provides any built-in methods for this.

AdamMc331
swap310
  • You can find the methods you can invoke on a string here: https://docs.oracle.com/javase/7/docs/api/java/lang/String.html I found it very helpful to look through all the standard capabilities defined by the methods in the class `String`. – Joop Nov 24 '14 at 17:36
  • One important remark: imagine `keywords.add("travel")`, is the result of your function for the second phrase to be `true` (the fact that it's just a part of a word is enough) or `false` (only complete words are to be verified). – Dominique Nov 13 '18 at 14:48

7 Answers

14

I would recommend iterating over the entire list. Thankfully, you can use an enhanced for loop:

for(String listItem : myArrayList){
   if(myString.contains(listItem)){
      // do something.
   }
}

EDIT To the best of my knowledge, you have to iterate over the list somehow. Think about it: how would you know which keywords the string contains without going through each of them?

EDIT 2

The only way I can see to make the iteration run faster is to return as soon as a match is found, without searching any further. You can put the return false statement after the loop, because if you have checked the entire list without finding a match, there clearly is none. Here is some more detailed code:

public boolean containsAKeyword(String myString, List<String> keywords){
   for(String keyword : keywords){
      if(myString.contains(keyword)){
         return true;
      }
   }
   return false; // Never found match.
}

EDIT 3

If you're using Kotlin, you can do this with the any method:

val containsKeyword = myArrayList.any { myString.contains(it) }
AdamMc331
  • Just out of curiosity, why do you add edits to your post? I mean last 10 minutes is just a small time if you compare it to the future of this post. In the future people will likely find the added benefits of edits very small. Just wondering. – Joop Nov 24 '14 at 17:40
  • 1
    You're right, and sometimes I question myself too. However, I'm adding (what I believe) to be useful and relevant information that is more helpful than what was originally there. Putting the bold **EDIT** blocks is probably just out of habit. I mean, I *am* making an edit, right? – AdamMc331 Nov 24 '14 at 17:43
  • 1
    @Joop Not that you asked this part, but I felt edit 2 was important because it addresses even more OP's question about the complexity of the solution. While this is still O(n), I wanted to address a way that could *potentially* shorten the iteration. – AdamMc331 Nov 24 '14 at 17:44
  • 1
    It is indeed relevant information you added. And this way you do increase the likelihood that people who already read your post will read the edited parts also. And especially since posts generate a lot of hits in the first few minutes and it decreases a lot over a short amount of time. I was debating with myself if I should or should not do it. That's why I asked. – Joop Nov 24 '14 at 17:53
  • 1
    @Joop anytime you think you can improve your question/answer do it. Why should you willingly stand by something you know can be better? It's a good work ethic to have in everything you do. No one does things great the first time, and thankfully here we have the options to improve and/or correct our mistakes. – AdamMc331 Nov 24 '14 at 17:58
8

Now you can use Java 8 stream for this purpose:

keywords.stream().anyMatch(keyword -> str.contains(keyword));
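
As a minimal sketch, the one-liner can be wrapped in the method signature from the question. The class name KeywordStreamDemo is made up for illustration; anyMatch stops at the first keyword that matches, so this is still a linear scan in the worst case.

```java
import java.util.Arrays;
import java.util.List;

// Hypothetical class name, used only for this illustration.
class KeywordStreamDemo {
    // Returns true as soon as any keyword occurs in str as a substring.
    static boolean containsAKeyword(String str, List<String> keywords) {
        return keywords.stream().anyMatch(str::contains);
    }

    public static void main(String[] args) {
        List<String> keywords = Arrays.asList("mary", "lamb");
        System.out.println(containsAKeyword("mary is a good girl", keywords));  // true
        System.out.println(containsAKeyword("she likes travelling", keywords)); // false
    }
}
```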
Paul Roub
5

In JDK8 you can do this like:

public static boolean hasKey(String key) {
   return keywords.stream().filter(k -> key.contains(k)).collect(Collectors.toList()).size() > 0;
}

hasKey(s1); // returns true
hasKey(s2); // returns false
fdam
2

Iterate over the keyword list and return true if the string contains your keyword. Return false otherwise.

public boolean containsAKeyword(String str, List<String> keywords){
    for(String k : keywords){
        if(str.contains(k))
            return true;
    }

    return false;
}
Pier-Alexandre Bouchard
2

Here is the solution

List<String> keywords = new ArrayList<>();
keywords.add("mary");
keywords.add("lamb");

String s1 = "mary is a good girl";
String s2 = "she likes travelling";
// The function
boolean check(String str, List<String> keywords) {
  Iterator<String> it = keywords.iterator();
  while(it.hasNext()){
    if(str.contains(it.next()))
       return true;
  }
  return false;
}
Junaid
0

You can add all the words in keywords to a HashMap. Then you can use str.contains for string 1 and string 2 to check whether any of the keywords is present.
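
One way to read this suggestion, sketched under the assumption that only whole-word matches count (the case Dominique raises in the comments on the question): put the keywords in a HashSet and look up each word of the string. The class name WholeWordKeywordDemo is hypothetical. Note that this changes the semantics: "travel" would no longer match "travelling".

```java
import java.util.Arrays;
import java.util.HashSet;
import java.util.Set;

// Hypothetical class name, used only for this illustration.
class WholeWordKeywordDemo {
    // Assumption: keywords are single words and only whole-word matches count.
    static boolean containsAKeyword(String str, Set<String> keywords) {
        // Split on runs of whitespace; each HashSet lookup is O(1) on average,
        // so the cost depends on the number of words in str, not on the number of keywords.
        for (String word : str.split("\\s+")) {
            if (keywords.contains(word)) {
                return true;
            }
        }
        return false;
    }

    public static void main(String[] args) {
        Set<String> keywords = new HashSet<>(Arrays.asList("mary", "lamb", "travel"));
        System.out.println(containsAKeyword("mary is a good girl", keywords));  // true
        System.out.println(containsAKeyword("she likes travelling", keywords)); // false
    }
}
```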

gashu
0

Depending on the size of the list, I would suggest using the matches() method of String. String.matches() takes a regex argument, so for smaller lists you could simply build a regular expression and evaluate it. Note that matches() has to match the entire string, hence the leading and trailing (.*):

String str = "This is a test string";
System.out.println(str.matches("(.*)test(.*)"));

This should print out "true."

Or you could use java.util.regex.Pattern.
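
A rough sketch of the Pattern variant, assuming a non-empty keyword list: the keywords are joined into one alternation and compiled once. The class and method names (KeywordRegexDemo, compileKeywords) are made up for illustration; Pattern.quote, Pattern.compile and Matcher.find are the standard java.util.regex calls.

```java
import java.util.Arrays;
import java.util.List;
import java.util.regex.Pattern;
import java.util.stream.Collectors;

// Hypothetical class name, used only for this illustration.
class KeywordRegexDemo {
    // Builds an alternation of the keywords; Pattern.quote escapes regex
    // metacharacters so each keyword is matched literally.
    // Assumption: keywords is non-empty (an empty list yields a pattern that matches everything).
    static Pattern compileKeywords(List<String> keywords) {
        String regex = keywords.stream()
                .map(Pattern::quote)
                .collect(Collectors.joining("|"));
        return Pattern.compile(regex);
    }

    // find() looks for a match anywhere in str, unlike String.matches(),
    // which has to match the whole string.
    static boolean containsAKeyword(String str, Pattern keywordPattern) {
        return keywordPattern.matcher(str).find();
    }

    public static void main(String[] args) {
        Pattern pattern = compileKeywords(Arrays.asList("mary", "lamb"));
        System.out.println(containsAKeyword("mary is a good girl", pattern));  // true
        System.out.println(containsAKeyword("she likes travelling", pattern)); // false
    }
}
```

Compiling the pattern once and reusing it avoids rebuilding the regex for every string that is checked.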

AdamMc331
wisher