I am working with two lists. The first contains a large sequence of strings. The second contains a smaller sequence of strings. I need to find the index at which the second list occurs, as a contiguous run, inside the first list.

I tried plain enumeration, but because the data is so large it is very slow. I was hoping for a faster way.

    List<string> first = new List<string>() { "AAA","BBB","CCC","DDD","EEE","FFF" };

    List<string> second = new List<string>() { "CCC","DDD","EEE" };

    int x = SomeMagic(first, second);

I would need x to be 2.

Jeya Suriya Muthumari
blfoleyus

4 Answers

1

OK, here is my variant using a good old foreach loop:

private int SomeMagic(IEnumerable<string> source, IEnumerable<string> target)
{
    /* Some obvious checks for `source` and `target` length / nullity are omitted */

    // searched pattern
    var pattern = target.ToArray();
    // candidates in form `candidate index` -> `checked length`
    var candidates = new Dictionary<int, int>();
    // iteration index
    var index = 0;

    // so, let the magic begin
    foreach (var value in source)
    {
        // check candidates
        foreach (var candidate in candidates.Keys.ToArray()) // <- we are going to change this collection
        {
            var checkedLength = candidates[candidate];
            if (value == pattern[checkedLength]) // <- here `checkedLength` is used in sense `nextPositionToCheck`
            {
                // candidate has matched the next value
                checkedLength += 1;
                // check if we are done here
                if (checkedLength == pattern.Length) return candidate; // <- exit point
                candidates[candidate] = checkedLength;
            }
            else
                // candidate has failed
                candidates.Remove(candidate);
        }

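        // check for new candidate
        if (value == pattern[0])
        {
            // a one-element pattern is already fully matched here; without this
            // check the next iteration would read pattern[1] out of range
            if (pattern.Length == 1) return index;
            candidates.Add(index, 1);
        }
        index++;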
    }

    // we did everything we could
    return -1;
}

We use a dictionary of candidates so that several overlapping partial matches can be tracked at once, which handles situations like:

var first = new List<string> { "AAA","BBB","CCC","CCC","CCC","CCC","EEE","FFF" };
var second = new List<string> { "CCC","CCC","CCC","EEE" };
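
To illustrate why more than one candidate has to be tracked, here is a minimal sketch. The class and method names (`OverlapDemo`, `SingleCandidateSearch`, `BruteForceSearch`) are hypothetical and not part of the code above, and the pattern is assumed to be non-empty. The first method deliberately keeps only a single candidate and misses the match in this input; the second is a plain brute-force reference that finds it.

```csharp
using System;
using System.Collections.Generic;

public static class OverlapDemo
{
    // Deliberately naive: remembers only ONE candidate start at a time.
    // Assumes a non-empty pattern.
    public static int SingleCandidateSearch(IReadOnlyList<string> source, IReadOnlyList<string> pattern)
    {
        int start = -1;  // start index of the current candidate
        int matched = 0; // number of pattern items matched so far
        for (int i = 0; i < source.Count; i++)
        {
            if (source[i] == pattern[matched])
            {
                if (matched == 0) start = i;
                matched++;
                if (matched == pattern.Count) return start;
            }
            else
            {
                // Mismatch: drop the candidate and possibly start a new one here.
                matched = source[i] == pattern[0] ? 1 : 0;
                start = matched == 1 ? i : -1;
            }
        }
        return -1;
    }

    // Reference implementation: try every start position in turn.
    public static int BruteForceSearch(IReadOnlyList<string> source, IReadOnlyList<string> pattern)
    {
        for (int i = 0; i + pattern.Count <= source.Count; i++)
        {
            int j = 0;
            while (j < pattern.Count && source[i + j] == pattern[j]) j++;
            if (j == pattern.Count) return i;
        }
        return -1;
    }

    public static void Main()
    {
        var first = new List<string> { "AAA", "BBB", "CCC", "CCC", "CCC", "CCC", "EEE", "FFF" };
        var second = new List<string> { "CCC", "CCC", "CCC", "EEE" };

        Console.WriteLine(SingleCandidateSearch(first, second)); // -1: the match is missed
        Console.WriteLine(BruteForceSearch(first, second));      // 3: the real start index
    }
}
```

The single-candidate version commits to the run starting at index 2, fails on the fourth "CCC", and by then has forgotten that indexes 3 and 4 were also valid starting points. Keeping every live candidate avoids that.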
vasily.sib
1

If you are willing to use MoreLinq then consider using Window:

var windows = first.Window(second.Count);
var result = windows
                .Select((subset, index) => new { subset, index = (int?)index })
                .Where(z => Enumerable.SequenceEqual(second, z.subset))
                .Select(z => z.index)
                .FirstOrDefault();

Console.WriteLine(result);
Console.ReadLine();

Window will allow you to look at 'slices' of the data in chunks (based on the length of your second list). Then SequenceEqual can be used to see if the slice is equal to second. If it is, the index can be returned. If it doesn't find a match, null will be returned.
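
For readers who would rather not add the dependency, the same idea can be sketched with the standard library only. This is an illustration under assumptions, not MoreLinq's implementation: `SlidingWindows` and `IndexOfWindow` are hypothetical helper names, and the window size is assumed to be at least 1.

```csharp
using System;
using System.Collections.Generic;
using System.Linq;

public static class WindowDemo
{
    // Hypothetical stand-in for a sliding window: yields every run of
    // `size` consecutive items, moving forward one item at a time.
    public static IEnumerable<IList<T>> SlidingWindows<T>(IEnumerable<T> source, int size)
    {
        var window = new Queue<T>(size);
        foreach (var item in source)
        {
            window.Enqueue(item);
            if (window.Count > size) window.Dequeue();          // drop the oldest item
            if (window.Count == size) yield return window.ToArray(); // snapshot the slice
        }
    }

    // Index of the first window equal to `second`, or null when none matches.
    public static int? IndexOfWindow(IList<string> first, IList<string> second)
    {
        return SlidingWindows(first, second.Count)
            .Select((subset, index) => new { subset, index = (int?)index })
            .Where(z => second.SequenceEqual(z.subset))
            .Select(z => z.index)
            .FirstOrDefault();
    }

    public static void Main()
    {
        var first = new List<string> { "AAA", "BBB", "CCC", "DDD", "EEE", "FFF" };
        var second = new List<string> { "CCC", "DDD", "EEE" };
        Console.WriteLine(IndexOfWindow(first, second)); // 2
    }
}
```

Each window is copied with `ToArray`, so this trades some allocation for simplicity. The cast to `int?` is what lets `FirstOrDefault` return null instead of 0 when nothing matches.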

mjwills
0

I implemented the SomeMagic method as shown below. It returns -1 if no match is found; otherwise it returns the index in the first list at which the match starts.

private int SomeMagic(List<string> first, List<string> second)
{
    if (first.Count < second.Count)
    {
        return -1;
    }

    for (int i = 0; i <= first.Count - second.Count; i++)
    {
        List<string> partialFirst = first.GetRange(i, second.Count);
        if (Enumerable.SequenceEqual(partialFirst, second))
            return i;
    }

    return -1;
}
mjwills
Raka
-1

You can use the Intersect extension method from the System.Linq namespace:

var CommonList = Listfirst.Intersect(Listsecond);
Atul Mathew
  • This only shows me the items common to both lists. I'm searching large sequences of data looking for patterns. – blfoleyus May 27 '19 at 04:50
  • I think you are asking for something like this; check out: https://stackoverflow.com/questions/28301074/linq-compare-two-lists-and-count-subset – Atul Mathew May 27 '19 at 05:01