I need your advice on the following.

I have a multi-dimensional IList containing items which have an index, Id and Text. Normally I know the value of Id and based on that I need to get the Text. Both Id and Text values are read from a database.

What we are currently using to get the value of Text field is:

foreach (Object myObj in List) 
{
    if (((MessageType)myObj).Id == id) 
    {
        return ((MessageType)myObj).Text;
    }
}

When the number of items in the IList becomes large (more than 32K), the lookup takes some time to process.

Question: Is there a way to efficiently get the Text value without iterating through the IList?

Things I tried without success:

  • Use List.IndexOf(Id) - did not work because IndexOf applies to text only.
  • Converting List to a multi-dimensional array - failed on List.CopyTo(array,0), my guess is because it is multi-dimensional: string[] array=new string[List.Count,List.Count]; List.CopyTo(array,0);

I cannot use an AJAX/jQuery solution because this is an existing (live) project and it would take too much effort to re-code.

Thanks

helb
All Blond
  • Are you doing this lookup multiple times on the same data set, or just once? – Murdock Mar 25 '14 at 22:09
  • It is actually done multiple times for different IDs, because it basically feeds the menu and sub-menu of the site, depending on the section of the site and the specifics of the area where it is opened. – All Blond Mar 25 '14 at 22:10
  • Your foreach pseudo code doesn't make a lot of sense to me. Where is the 'multi-dimensional' aspect there? It looks like a flat list. – Kevin Mar 25 '14 at 22:11
  • .Id == id) { return ((MessageType)myObj).Text; and id is not the index of the list as I mentioned in original post. – All Blond Mar 25 '14 at 22:12
  • Because it has an Id and Text property? Looks like a flat list of type MessageType to me. – Kevin Mar 25 '14 at 22:13
  • I agree with Kevin, it is unclear what `List` is here. Lists are not multi-dimensional, do you have a `List<List<T>>`? Went to do some LINQ magic to this and realized there is not enough information to answer well. – Evan L Mar 25 '14 at 22:14
  • List is a System.Collections.IList holding the ID and TEXT values from the database. The index into that list is provided by the IList. – All Blond Mar 25 '14 at 22:15
  • If it is a flat list, can you or Kevin offer a faster solution compared to what I have? – All Blond Mar 25 '14 at 22:18
  • @AllBlond I have posted a solution for a flat list, you could easily put said solution in a method that returns the text given an id. – Evan L Mar 25 '14 at 22:19
  • @helb has already gotten us beat. :) If you don't want to add them to a dictionary then Evan's solution will at least let you get out of your foreach as soon as you find the first match. The performance on it will be exactly the same as if you add a break in your foreach. – Kevin Mar 25 '14 at 22:20
  • @Kevin As soon as I read "32K", it was kinda obvious that a linear (in 1 or 2 dimensions) data structure will be no good here. – helb Mar 25 '14 at 22:22

3 Answers

If you want fast searching by some identifier in a collection with 32k elements, you should use Dictionary<K,V> as your collection.

var dict = new Dictionary<IDType, MessageType>();

A Dictionary is implemented as a hash table: the hash code of the key (in your case Id) determines where the element is stored, so an element with a specific key can be found in roughly constant time without looking at all elements. For more information see MSDN.

If you cannot refactor the collection to be a dictionary, you may initially fill the dictionary (slow) and then search in the dictionary (fast). This will only be faster if you do multiple searches before you fill the dictionary again, i.e. if your list does not change often.

foreach(object o in List)
{
    var msg = (MessageType)o;
    dict.Add(msg.Id, msg);
}

Searching then is easy:

MessageType msg = dict[id];
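
Note that the indexer `dict[id]` throws a KeyNotFoundException for an unknown id, and `dict.Add` throws if the same id occurs twice. A minimal sketch of a wrapper that avoids both, assuming an `int` Id (the question does not state the Id type) and using a hypothetical `MessageIndex` class name:

```csharp
using System;
using System.Collections;
using System.Collections.Generic;

// Stand-in for the question's MessageType; the int Id type is an assumption.
class MessageType
{
    public int Id;
    public string Text;
}

// Hypothetical helper: builds the index once, then answers lookups from it.
class MessageIndex
{
    private readonly Dictionary<int, string> textById =
        new Dictionary<int, string>();

    public MessageIndex(IList messages)
    {
        foreach (object o in messages)
        {
            var msg = (MessageType)o;
            // Keep the first occurrence of an id, like the original foreach,
            // instead of letting Add throw on a duplicate key.
            if (!textById.ContainsKey(msg.Id))
            {
                textById.Add(msg.Id, msg.Text);
            }
        }
    }

    // Returns null for an unknown id instead of throwing.
    public string GetText(int id)
    {
        string text;
        return textById.TryGetValue(id, out text) ? text : null;
    }
}
```

Build the index once at the place where the list is filled (`new MessageIndex(List)`) and call `GetText(id)` for each lookup; rebuilding it per lookup would cost more than the linear search it replaces.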

EDIT: Well, I was curious and wrote a test routine which compares the linear search and the dictionary approach. Here's what I used:

using System;
using System.Collections;
using System.Collections.Generic;
using System.Diagnostics;

namespace ConsoleApplication1
{
    class MessageType
    {
        public string Id;
        public string Text;
    }

    class Program
    {
        static void Main(string[] args)
        {
            var rand = new Random ();
            // filling a list with random text messages
            List<MessageType> list = new List<MessageType>();
            for (int i = 0; i < 32000; i++)
            { 
                string txt = rand.NextDouble().ToString();
                var msg = new MessageType() {Id = i.ToString(), Text = txt };
                list.Add(msg);
            }
            IList List = (IList)list;

            // doing some random searches
            foreach (int some in new int[] { 2, 10, 100, 1000 })
            {
                var watch1 = new Stopwatch();
                var watch2 = new Stopwatch();
                Dictionary<string, MessageType> dict = null;
                for (int i = 0; i < some; i++)
                {
                    string id = rand.Next(32000).ToString();
                    watch1.Start();
                    LinearLookup(List, id);
                    watch1.Stop();

                    watch2.Start();
                    // fill once
                    if (dict == null)
                    {
                        dict = new Dictionary<string, MessageType>();
                        foreach (object o in List)
                        {
                            var msg = (MessageType)o;
                            dict.Add(msg.Id, msg);
                        }
                    }
                    // lookup 
                    DictionaryLookup(dict, id);
                    watch2.Stop();
                }

                Console.WriteLine(some + " x LinearLookup took " 
                    + watch1.Elapsed.TotalSeconds + "s");
                Console.WriteLine("Dictionary fill and " + some 
                    + " x DictionaryLookup took " 
                    + watch2.Elapsed.TotalSeconds + "s");
            }
        }

        static string LinearLookup(IList List, string id)
        {
            foreach (object myObj in List)
            {
                if (((MessageType)myObj).Id == id)
                {
                    return ((MessageType)myObj).Text;
                }
            }
            throw new Exception();
        }

        static string DictionaryLookup(Dictionary<string, MessageType> dict,
            string id)
        {
            return dict[id].Text;
        }
    }
}

The results I got in Release / x86:

Number of | Time [ms] with | Time[ms] with | Speedup (approx.)
searches  | linear search  | dictionary(*) | with dictionary
----------+----------------+---------------+-----------------
      2   |      1.161     |   2.006       |   0.6
----------+----------------+---------------+-----------------
     10   |      2.834     |   2.060       |   1.4
----------+----------------+---------------+-----------------
    100   |     25.39      |   1.973       |   13
----------+----------------+---------------+-----------------
   1000   |    261.4       |   5.836       |   45
----------+----------------+---------------+-----------------

(*) including filling the dictionary once.

So, I was a bit optimistic to say that searching twice would already pay off. In my test application I have to search 10 times for the dictionary to be faster.

I'm sorry I could not make a more realistic example; my Ids are all sorted. Feel free to try modifying and experimenting though ;-)

helb
  • Sorry @helb your solution would make app even slower than it is now. – All Blond Mar 25 '14 at 22:25
  • @AllBlond that's simply not true. `Dictionary` is much faster than any `IList`. – Evan L Mar 25 '14 at 22:27
  • If you fill the dictionary once and do at least 2 searches it will already be much faster than searching the list twice. (Always assuming a LOT of elements) – helb Mar 25 '14 at 22:28
  • @helb please try it and report back. It should be faster if you are doing multiple lookups. – Kevin Mar 25 '14 at 22:30
  • I will try, but from the looks of it I will have a problem because 1. the dictionary will get recreated on each call of that public string where I do the search, and 2. I will need to use IDictionary instead of Dictionary. – All Blond Mar 25 '14 at 22:33
  • @Kevin Well, here you go. I wrote some test code to compare the searches. No real surprises here, just that you have to search a few times in order for the dictionary to pay off. – helb Mar 25 '14 at 23:07
  • @helb very nice data! I already gave you +1 but I'd give you a couple more if I could. :) – Kevin Mar 25 '14 at 23:26
  • @helb nicely done, same as Kevin, I already upvoted but this deserves more. – Evan L Mar 26 '14 at 02:17
  • Nice solution for a new module. I have an existing one, and for some reason it does not accept what you are suggesting. It fails when I attempt to populate the dictionary. – All Blond Mar 26 '14 at 14:21
  • @AllBlond Update your question, or (even better) ask a new one. Show how you try to populate the dictionary. – helb Mar 26 '14 at 14:22
  • @helb - found a way to implement your suggestion in my app with a few tweaks. Works as advertised. One of the tweaks is that I put it into a cache, which gave me additional speed since I only need to populate it once in a while, when the cache expires. – All Blond Mar 26 '14 at 15:04
  • @AllBlond Great! Let me know your performance gain / application experience. – helb Mar 26 '14 at 15:08
  • @helb: And to be on the safe side I used a failsafe try-catch: if the dictionary fails for any reason, the catch falls back to the original method. Worked like a charm. Users are happy because it sped up the process, and I am happy because the modifications lowered processor usage by at least 20%. The percentage is so high because that is not the only place where I replaced that type of loop with the dictionary solution. – All Blond Apr 10 '14 at 20:18
  • The quad-processor i7 (meaning 4 i7 processors) on the web server was at 85%-90%; now it rarely peaks at 55%. – All Blond Apr 10 '14 at 20:24
  • @AllBlond Wow, sounds like a happy ending :-) – helb Apr 10 '14 at 21:02

From the looks of it you have a List<MessageType> here, which is not multi-dimensional. Rather the objects inside the list have multiple properties.

You could easily get them out with LINQ, which is more concise than a loop but is still a linear scan of the list:

var text = (from MessageType msgType in myList
            where msgType.Id == id
            select msgType.Text).FirstOrDefault();

Or even easier with an inline LINQ statement:

var text = myList.Where(s => s.Id == id).Select(s => s.Text).FirstOrDefault();

NOTE: As mentioned in the comments above, the speed of these LINQ statements is only as good as the object's position in the List. If it is the last object in the list, you will likely see the same performance problem as with the loop. Dictionary<Index, MessageType> is going to be much more performant.
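
If the list is a non-generic System.Collections.IList rather than a List<MessageType>, the method syntax needs a `Cast<MessageType>()` call first, which in turn requires `using System.Linq;`. A minimal sketch, assuming an `int` Id and a hypothetical `FindText` helper name:

```csharp
using System.Collections;
using System.Linq; // brings Cast<T>(), Where, Select and FirstOrDefault into scope

// Stand-in for the question's MessageType; the int Id type is an assumption.
class MessageType
{
    public int Id;
    public string Text;
}

static class MessageQueries
{
    // Hypothetical helper; this is still a linear scan like the original foreach.
    public static string FindText(IList messages, int id)
    {
        return messages.Cast<MessageType>()   // non-generic IList -> IEnumerable<MessageType>
                       .Where(m => m.Id == id)
                       .Select(m => m.Text)
                       .FirstOrDefault();     // null when no item has this id
    }
}
```

The query syntax shown earlier (`from MessageType msgType in myList`) compiles to the same `Cast<MessageType>()` call, so it depends on the same using directive.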

Evan L
  • Nope, this will not work because the list is of type System.Collections.IList and not LINQ. The error from VS is: CAST not found. – All Blond Mar 25 '14 at 22:24
  • How is that faster? I'm just curious because I think this does the same as the OP posted. – helb Mar 25 '14 at 22:25
  • @AllBlond can you please post the actual implementation then? Your pseudo code is not clear. – Evan L Mar 25 '14 at 22:25
  • @helb I added a note to this explaining *when* it would be faster. Your solution is better, mine is just an alternative that may improve performance a bit. – Evan L Mar 25 '14 at 22:26
  • That was the actual implementation of the public string. The ID is passed to it, and List is declared as a global variable and gets filled elsewhere in the app during onInit. – All Blond Mar 25 '14 at 22:28
  • @AllBlond if that's the *actual implementation* then you have compiler errors galore... I was saying can I see the declaration of `List` (bad variable name) – Evan L Mar 25 '14 at 22:29
  • Yes, for a new implementation @helb's solution will work perfectly, but I have an existing one with only one issue: it is very slow. I tried to implement the dictionary and got the same error, CAST method not found for IList. The following implementation fails: if (dict == null) { foreach (Object myObj in List) { var msg = (MessageInLanguage)myObj; dict.Add(msg.Id, msg.ToString()); on this line it claims that msg.ToString() is null or not an object, while it clearly has a value in the watch list. – All Blond Mar 26 '14 at 14:18
  • shouldn't it be `msg.Text`? Also, you do not show a call to `Cast<>` in your code.. so we can't really do much to debug that... – Evan L Mar 26 '14 at 14:53

A better way is to use ILookup. For example:

var look = query.ToLookup(x => x.SomeID, y => y.Name);

and use:

if (look.Contains(myID)){
   var name = look[myID].First();
}
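
Unlike a Dictionary, a lookup accepts several items with the same key, and indexing a missing key returns an empty sequence rather than throwing. A self-contained sketch of the snippet above, where the `Item` class and the `LookupDemo` helper are hypothetical stand-ins for whatever `query` contains:

```csharp
using System.Collections.Generic;
using System.Linq;

// Hypothetical element type with the two members used in the answer.
class Item
{
    public int SomeID;
    public string Name;
}

static class LookupDemo
{
    // Groups names by id; duplicate ids are allowed and keep source order.
    public static ILookup<int, string> Build(IEnumerable<Item> query)
    {
        return query.ToLookup(x => x.SomeID, y => y.Name);
    }

    // look[myID] is an empty sequence for an unknown id,
    // so FirstOrDefault returns null in that case.
    public static string FirstNameFor(ILookup<int, string> look, int myID)
    {
        return look[myID].FirstOrDefault();
    }
}
```

As with the dictionary, the lookup only pays off when it is built once and queried many times.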
Jacob Q