I am building a natural language processor in C#, and many 'words' in our database are actually multiple-word phrases that refer to a single noun or action. Please, no discussion of this design decision; suffice it to say it cannot be changed at this time. I have string arrays of related words (chunks of the sentence) that I need to test against these phrases and words. What is an appropriately idiomatic way to handle sub-array extraction so that I run the least risk of overflow errors and the like?
To give an example of the desired logic, let me step through a run with a sample chunk. For our purposes, assume that the only multiple-word phrase from the database is 'quick brown'.
Full phrase: The quick brown fox -> encoded as {"The", "quick", "brown", "fox"}
First iteration: Test "The quick brown fox" -> returns nothing
Second iteration: Test "The quick brown" -> returns nothing
Third iteration: Test "The quick" -> returns nothing
Fourth iteration: Test "The" -> returns value
Fifth iteration: Test "quick brown fox" -> returns nothing
Sixth iteration: Test "quick brown" -> returns value
Seventh iteration: Test "fox" -> returns value
Sum all returned values and return the total.
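To make the walkthrough concrete, here is a rough sketch of the loop I have in mind. `PhraseMatcher`, `SumValues`, and the `lookup` delegate are placeholder names rather than real code from our project, and I am assuming the returned values are integers. The actual database call would go wherever `lookup` is used.

```csharp
using System;

public static class PhraseMatcher
{
    // 'lookup' is a hypothetical stand-in for the database query:
    // it returns a value for a known word or phrase, or null otherwise.
    public static int SumValues(string[] words, Func<string, int?> lookup)
    {
        int total = 0;
        int start = 0;

        while (start < words.Length)
        {
            int matchedLength = 0;

            // Try the longest candidate first, then shrink from the right.
            // start + length never exceeds words.Length, so the slice stays in range.
            for (int length = words.Length - start; length >= 1; length--)
            {
                string candidate = string.Join(" ", words, start, length);
                int? value = lookup(candidate);
                if (value.HasValue)
                {
                    total += value.Value;
                    matchedLength = length;
                    break;
                }
            }

            // Skip past the matched words, or past one word if nothing matched.
            start += Math.Max(matchedLength, 1);
        }

        return total;
    }
}
```

With a lookup that knows only "The", "quick brown", and "fox", this tests the same seven candidates in the same order as the steps above. What I am unsure about is whether building each candidate with the `string.Join(separator, array, startIndex, count)` overload is the idiomatic way to do the sub-array part.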
I have some ideas of how to go about this, but the more I look at it, the more worried I get about array addressing errors and other such horrors plaguing my code. The phrase comes in as a string array, but I'm fine with converting it to an IEnumerable<string>. My only concern there is that IEnumerable has no indexer.