1

I have a list of strings that contains 4 items:

Orange Lemon Pepper Tomato

Also, I have a String str which has a sentence:

Today, I ate a tomato and an orange.

1) How can I check whether str contains any of the keywords from the list, ignoring upper and lower case, so that anything that matches is captured?

I tried list.Contains(str), but it doesn't work because it looks for a list element equal to the whole string.

I also tried Dim result As String() = list.FindAll(str, Function(s) s.ToLower().Contains(str)), but that didn't work either.

2) What if str contained tomatoes instead of tomato? How can I still detect the tomato part and ignore the es suffix?

Any suggestions or ideas?

Sergey Berezovskiy
HShbib

6 Answers

3
var list = new string[] { "Orange", "Lemon", "Pepper", "Tomato" };
var str = "Today, I ate a tomato and an orange.";

With LINQ and regular expressions you can check whether the string contains any keyword:

list.Any(keyword => Regex.IsMatch(str, Regex.Escape(keyword), RegexOptions.IgnoreCase));

Or get matched keywords:

var matched = list.Where(keyword =>
                Regex.IsMatch(str, Regex.Escape(keyword), RegexOptions.IgnoreCase));
// "Orange", "Tomato"

Note that this matches both tomatoes and footomato. If a keyword should only match at the start of a word, change the search pattern slightly: @"(^|\s)" + Regex.Escape(keyword)
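As a rough sketch of the word-start idea, the snippet below uses the `\b` word-boundary anchor instead of `(^|\s)`, so a keyword preceded by punctuation (for example an opening quote) is also found. The class name `KeywordDemo`, the helper `MatchAtWordStart`, and the sample sentence are made up for illustration, and the sketch assumes every keyword begins with a letter or digit, since `\b` behaves differently in front of non-word characters.

```csharp
using System;
using System.Linq;
using System.Text.RegularExpressions;

public static class KeywordDemo
{
    // Hypothetical helper: returns the keywords that occur at the start of a word in text.
    public static string[] MatchAtWordStart(string text, string[] keywords)
    {
        return keywords
            // \b anchors the match at a word boundary; Regex.Escape keeps
            // special characters in the keyword from being read as pattern syntax.
            .Where(k => Regex.IsMatch(text, @"\b" + Regex.Escape(k), RegexOptions.IgnoreCase))
            .ToArray();
    }

    public static void Main()
    {
        var list = new[] { "Orange", "Lemon", "Pepper", "Tomato" };
        var matched = MatchAtWordStart("Today, I ate tomatoes, a footomato and an orange.", list);

        // "tomatoes" matches Tomato at its start; "footomato" does not.
        Console.WriteLine(string.Join(", ", matched)); // Orange, Tomato
    }
}
```

Results come back in the order of the keyword list, not in the order the words appear in the sentence.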

jessehouwing
Sergey Berezovskiy
  • Don't forget to call Regex.Escape on your keywords. Otherwise, a keyword containing a special regex character can be misread as pattern syntax or cause a runtime exception. So Regex.IsMatch(str, Regex.Escape(keyword), ...). – jessehouwing Dec 18 '12 at 00:36
3

If case sensitivity isn't an issue you could do this:

List<string> test = new List<string>();
test.Add("Lemon");
test.Add("Orange");
test.Add("Pepper");
test.Add("Tomato");

string str = "Today, I ate a tomato and an orange.";

foreach (string s in test)
{
      // Or use StringComparison.OrdinalIgnoreCase when cultures are of no issue.
      if (str.IndexOf(s, StringComparison.CurrentCultureIgnoreCase) > -1)
      {
          Console.WriteLine("Sentence contains word: " + s);
      }
}

Console.Read();
jessehouwing
DGibbs
2
Regex reg = new Regex("(Orange|lemon|pepper|Tomato)", RegexOptions.IgnoreCase | RegexOptions.Singleline);
MatchCollection mc = reg.Matches("Today, I ate tomatoes and an orange.");
foreach (Match mt in mc)
{
    Debug.WriteLine(mt.Groups[0].Value);
}
urlreader
1

With list.Contains(str), you're checking whether the list contains an element equal to the whole string. To check whether str contains any of the words in the list, you need something like this:

foreach(var s in list)
{
     if(str.ToLower().Contains(s.ToLower()))
     {
          //do your code here
     }
}

That iterates through your list and checks whether str contains each keyword. It also covers your question 2: since tomato is a substring of tomatoes, it passes the check. The ToLower() calls convert both strings to lower case, which is a common way to ignore case.
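The same substring check can also be written without creating lower-cased copies, by passing a StringComparison value to IndexOf. The following is only a sketch under that assumption: the class name `SubstringDemo` and the helper `FindKeywords` are invented for illustration, and StringComparison.OrdinalIgnoreCase is chosen on the assumption that culture-specific casing rules don't matter for these keywords.

```csharp
using System;
using System.Collections.Generic;
using System.Linq;

public static class SubstringDemo
{
    // Hypothetical helper: returns every keyword that occurs anywhere in text, ignoring case.
    public static List<string> FindKeywords(string text, IEnumerable<string> keywords)
    {
        return keywords
            // IndexOf returns -1 when the keyword is not found.
            .Where(k => text.IndexOf(k, StringComparison.OrdinalIgnoreCase) >= 0)
            .ToList();
    }

    public static void Main()
    {
        var list = new List<string> { "Orange", "Lemon", "Pepper", "Tomato" };
        var found = FindKeywords("Today, I ate tomatoes and an orange.", list);

        // "tomatoes" still counts, because Tomato is a substring of it.
        Console.WriteLine(string.Join(", ", found)); // Orange, Tomato
    }
}
```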

PiousVenom
  • Don't use ToLower for string comparisons; use ToUpperInvariant or ToUpper(CultureInfo.CurrentCulture) instead. See: http://stackoverflow.com/questions/2801508/what-is-wrong-with-tolowerinvariant or use IndexOf with a StringComparison argument and test for > -1: http://msdn.microsoft.com/en-us/library/ms224425(v=vs.95).aspx – jessehouwing Dec 18 '12 at 00:41
1
Private Function stringContainsOneOfMany(ByVal haystack As String, ByVal needles As String()) As Boolean
    For Each needle In needles
        If haystack.ToLower.Contains(needle.ToLower) Then
            Return True
        End If
    Next
    Return False
End Function

Usage:

    Dim keywords As New List(Of String) From {
        "Orange", "Lemon", "Pepper", "Tomato"}
    Dim str As String = "Today, I ate a tomato and an orange"
    If stringContainsOneOfMany(str, keywords.ToArray) Then
        'do something
    End If
Steve
  • Don't use ToLower for string comparisons; use ToUpperInvariant or ToUpper(CultureInfo.CurrentCulture) instead. See: http://stackoverflow.com/questions/2801508/what-is-wrong-with-tolowerinvariant or use IndexOf with a StringComparison argument and test for > -1: http://msdn.microsoft.com/en-us/library/ms224425(v=vs.95).aspx – jessehouwing Dec 18 '12 at 00:41
1
    Dim str As String = "Today, I ate a tomato and an orange"
    Dim sWords As String = "Orange Lemon Pepper Tomato"
    Dim sWordArray() As String = sWords.Split(" "c) ' Char literal, so this also compiles with Option Strict On

    For Each sWord In sWordArray

        If str.ToLower.Contains(sWord.ToLower) Then
            Console.WriteLine(sWord)
        End If

    Next sWord
Ciarán
  • Don't use ToLower for string comparisons; use ToUpperInvariant or ToUpper(CultureInfo.CurrentCulture) instead. See: http://stackoverflow.com/questions/2801508/what-is-wrong-with-tolowerinvariant or use IndexOf with a StringComparison argument and test for > -1: http://msdn.microsoft.com/en-us/library/ms224425(v=vs.95).aspx – jessehouwing Dec 18 '12 at 00:43