I have a list of strings (around 18,000 items) and I need to find the substrings that the items have in common. Below is an example list: "test1", "test 2", "est 2", "west1"

I need this result:

"test" - 2

"est" - 4

"est1" - 2

"est 2" - 2

I would like to do this with LINQ to make the search fast, if possible. Thanks in advance.

Sergiu Cojocaru
  • LINQ won't necessarily make the operation fast. LINQ is compiled into the appropriate for loops anyway... – MoonKnight May 02 '13 at 09:19
  • You'll find people more willing to help if you show [what you've tried so far](http://www.whathaveyoutried.com). – anaximander May 02 '13 at 09:20
  • I think any solution for *all* strings (like in your example) is likely to be O(N^2) so it's not going to be hugely fast. Checking each individual string will be O(N). Just to confirm: If we add to your sample list of strings the string "e", would the count for that be 5? (Because it's in "e", "test", "est", "est1" and "est 2") – Matthew Watson May 02 '13 at 09:22
  • You have a list of a lot of strings and want to detect (and count) common substrings? Good luck with that... basically you need to go through each and every string, build all possible substrings (<- hard part), probably use them as keys in a `Dictionary` and count the occurrences of each key. This *will* get slow. – Corak May 02 '13 at 09:25
  • @MatthewWatson, generally you are right: if we add "e" to the list, I will need the result "e" - 5. In the real case I will ignore substrings with length < 3. – Sergiu Cojocaru May 02 '13 at 10:35
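
The outline in the comments above (go through every string, build its substrings, use them as `Dictionary` keys, count each key) can be sketched roughly as follows. This is a minimal sketch, not code from the thread: the names `SubstringCounter` and `CountCommonSubstrings` are hypothetical, and it assumes that a substring is counted once per string that contains it and that substrings shorter than three characters are ignored, as mentioned in the comments. It generates every substring of every item, so time and memory grow quickly with string length.

```csharp
using System;
using System.Collections.Generic;
using System.Linq;

public static class SubstringCounter
{
    // Hypothetical helper (not from the thread): maps each substring of at
    // least minLength characters to the number of strings that contain it.
    public static Dictionary<string, int> CountCommonSubstrings(
        IEnumerable<string> items, int minLength = 3)
    {
        var counts = new Dictionary<string, int>();

        foreach (var item in items)
        {
            // Collect this string's distinct substrings first, so a substring
            // that repeats inside one string is counted only once for it.
            var seen = new HashSet<string>();
            for (int start = 0; start + minLength <= item.Length; start++)
            {
                for (int length = minLength; start + length <= item.Length; length++)
                {
                    seen.Add(item.Substring(start, length));
                }
            }

            foreach (var substring in seen)
            {
                // TryGetValue leaves current at 0 when the key is missing.
                counts.TryGetValue(substring, out int current);
                counts[substring] = current + 1;
            }
        }

        // Assumption: only substrings shared by two or more strings matter.
        return counts.Where(pair => pair.Value >= 2)
                     .ToDictionary(pair => pair.Key, pair => pair.Value);
    }
}
```

Called with the sample list from the question, this should report "est" as 4 and "test", "est1" and "est 2" as 2 each. It also reports other shared fragments such as "tes" and "st1", so a further filtering step would be needed if only the longest matches are wanted.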

1 Answer

I guess that this is what you want:

var listWithSubstring = originalList.Where(i => i.Contains("est"));
Zbigniew
  • Maybe `.Count` would be enough, but this will cover it too. – WhileTrueSleep May 02 '13 at 09:19
  • Well, I'm not really sure what @SergiuCojocaru wants, but this should be enough. Though you may be right that I could mention `.Count`/`.Count()` in my answer. – Zbigniew May 02 '13 at 09:22
  • The strings may or may not contain "est". I need to find all common substrings. For example, if I add "home" and "omen" to the list above, the algorithm should show the previous results plus "ome" - 2. So in fact I do not know the contents of the list at coding time. – Sergiu Cojocaru May 02 '13 at 10:34
  • @SergiuCojocaru then you want to find common substrings, which is too *big* (at least without some code from you) to answer in this question. However I can direct you to [this question](http://stackoverflow.com/questions/2418504/algorithm-to-find-common-substring-across-n-strings) – Zbigniew May 02 '13 at 14:12
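
For the simpler case raised in the first comments on this answer, where the substring is known in advance and only the number of matching strings is needed, a sketch along these lines may be enough. The names `KnownSubstring` and `CountContaining` are hypothetical, and the match is case-sensitive because `string.Contains` compares ordinally.

```csharp
using System;
using System.Collections.Generic;
using System.Linq;

public static class KnownSubstring
{
    // Hypothetical helper: counts the strings that contain the given fragment.
    public static int CountContaining(IEnumerable<string> items, string fragment)
    {
        // Count with a predicate avoids building the intermediate Where sequence.
        return items.Count(item => item.Contains(fragment));
    }
}
```

With the sample list from the question, `KnownSubstring.CountContaining(list, "est")` should give 4. This does not discover substrings on its own; it only counts one that is already known.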