2

I have a

public static string[] words = {"word1","word2","word3"};

I want to count the occurrences of word1 + the occurrences of word2 + the occurrences of word3 in a string.

I tried

Regex.Matches(string, "word1").Count 

which works fine for a single word, but I don't know how to search for all of the strings. I don't want to use foreach because the array "words" can contain up to 25 strings. Thanks.

Rufus L
  • You'll have to use some kind of a loop. Foreach is hardly the worst of them. – 15ee8f99-57ff-4f92-890c-b56153 May 24 '19 at 20:53
  • I want the total count of all the words in the array. I don't care about individual counts. – Claudinho18 May 24 '19 at 21:05
  • I would join all the words in the array with the alternation character `|`, surround the result with parentheses, then surround that with word boundaries, i.e. `\b(word1|word2|word3|..)\b`, then use a global find-all type function. The size of the returned match collection will tell you the count. Or you could just do a single match using `(?s)(?:.*?\b(word1|word2|word3|..)\b)+`, then get the Capture Collection size for group 1, which gives the same count. –  May 24 '19 at 21:19

3 Answers

3

This is a more versatile way to do this.
A regex gives you more control over the context of the words it finds.
I'm also guessing it's a lot faster, since it does everything in a single match
without a lot of manipulation of primitives.

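// Requires: using System; using System.Text.RegularExpressions;
string[] words = { "word1", "word2", "word3" };
// Escape each word so any regex metacharacters in it are matched literally.
string alternation = string.Join("|", Array.ConvertAll(words, Regex.Escape));
// Each repetition of the outer group adds one capture to group 1.
Regex rx = new Regex(@"(?is)(?:.*?\b(" + alternation + @")\b)+");

string strin = "There are some word3 and more words and word1 and more word3, again word1";

Match m = rx.Match(strin);
if (m.Success)
    Console.WriteLine("Found {0} words", m.Groups[1].Captures.Count);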

Output

Found 4 words


The regex above uses the word boundary \b on both sides of the alternation.
An alternative boundary choice is whitespace: (?<!\S) before the group and (?!\S) after it.
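
As a rough sketch of the whitespace-boundary variant mentioned above: the pattern is the same as in the answer, with each \b swapped for the corresponding lookaround. The class name WhitespaceBoundaryDemo and the method name CountWhitespaceDelimited are made up for illustration, and the sketch assumes the words array is not empty.

```csharp
using System;
using System.Text.RegularExpressions;

public static class WhitespaceBoundaryDemo
{
    // Hypothetical helper: counts listed words that are delimited by whitespace
    // (or by the start/end of the input) instead of by \b.
    // Assumes words contains at least one entry.
    public static int CountWhitespaceDelimited(string input, string[] words)
    {
        // Escape each word so it is matched literally.
        string alternation = string.Join("|", Array.ConvertAll(words, Regex.Escape));
        var rx = new Regex(@"(?is)(?:.*?(?<!\S)(" + alternation + @")(?!\S))+");

        Match m = rx.Match(input);
        // Every repetition of the outer group records one capture in group 1.
        return m.Success ? m.Groups[1].Captures.Count : 0;
    }

    public static void Main()
    {
        string[] words = { "word1", "word2", "word3" };
        string input = "There are some word3 and more words and word1 and more word3, again word1";
        // Prints 3: "word3," is followed by a comma, so (?!\S) rejects it.
        Console.WriteLine(CountWhitespaceDelimited(input, words));
    }
}
```

With \b the same input gives 4, because a comma counts as a word boundary but not as whitespace. Which one is right depends on whether punctuation next to a word should disqualify it.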

  • It's much faster than using Regex.Matches(), which is slower than dirt. –  May 24 '19 at 22:11
2

You can utilize System.Linq to get the Sum of the Count of all the Matches by doing something like:

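// Requires: using System.Linq; using System.Text.RegularExpressions;
private static void Main()
{
    var words = new[] {"dog", "coyote", "fox"};

    var input = "The quick brown fox jumps over the lazy dog";

    // Run one regex search per word and add up the match counts.
    var wordCount = words.Sum(word => Regex.Matches(input, word).Count);

    // wordCount == 2 (one match each for "fox" and "dog")
}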
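
Note that the pattern above is each raw word, so it also matches inside longer words ("dog" in "dogma") and treats regex metacharacters in a word as syntax. If only whole words should count, a possible variation is sketched below; the class name WholeWordSum and the method name CountWholeWords are invented for this example and are not part of the answer.

```csharp
using System;
using System.Linq;
using System.Text.RegularExpressions;

public static class WholeWordSum
{
    // Hypothetical helper: same Sum-of-Counts idea, but each word is escaped
    // and wrapped in \b so that only whole-word matches are counted.
    public static int CountWholeWords(string input, string[] words)
    {
        return words.Sum(word =>
            Regex.Matches(input, @"\b" + Regex.Escape(word) + @"\b").Count);
    }

    public static void Main()
    {
        var words = new[] { "dog", "coyote", "fox" };

        // Prints 2: "fox" and "dog" each appear once as whole words.
        Console.WriteLine(CountWholeWords("The quick brown fox jumps over the lazy dog", words));

        // Prints 0: "dogma" and "foxes" only contain the words as substrings.
        Console.WriteLine(CountWholeWords("A dogma about foxes", words));
    }
}
```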
Rufus L
0

Your best, maybe only, option is a loop that iterates through the list of words.

My preference is something like this:

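// "input" is the string to search; "words" is the array from the question.
int intTotalWordCount = 0;

for (int intJ = 0; intJ < words.Length; intJ++)
{
    // Add the number of matches for the current word to the running total.
    intTotalWordCount += Regex.Matches(input, words[intJ]).Count;
}

Console.WriteLine(@"Final word count = {0}", intTotalWordCount);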

Of course, you could just as well wrap the above block in a method that returns intTotalWordCount.
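
A minimal sketch of that wrapper might look like the following. The class name WordCounter and the method name CountTotalOccurrences are placeholders, and the call to Regex.Escape is an addition on the assumption that the words should be matched literally.

```csharp
using System;
using System.Text.RegularExpressions;

public static class WordCounter
{
    // Hypothetical wrapper around the loop: returns the combined number of
    // matches of every entry in words within input.
    public static int CountTotalOccurrences(string input, string[] words)
    {
        int total = 0;

        for (int j = 0; j < words.Length; j++)
        {
            // Regex.Escape (an assumption, not in the loop above) makes each
            // word match literally even if it contains regex metacharacters.
            total += Regex.Matches(input, Regex.Escape(words[j])).Count;
        }

        return total;
    }

    public static void Main()
    {
        string[] words = { "word1", "word2", "word3" };
        string input = "There are some word3 and more words and word1 and more word3, again word1";

        // Prints "Final word count = 4".
        Console.WriteLine("Final word count = {0}", CountTotalOccurrences(input, words));
    }
}
```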

David A. Gray