0

I am working on an application where a user can filter records by categories.

A record must have 1 but can have many categories. (1-*)

The question I have is, What can I do to improve this search? It currently runs at about an O(n^3) for the following reason:

A transaction may have many records (1-many) A record may have many categories (1-many) Iterate through the categories selected for searching

The basic outline of what I am doing is the following:

retrieve all announcements

if(startDate and endDate have values)
    model = model.Where(x => x.WADate >= start.Value && x.WADate <= end.Value).ToList();

if(queryString is not null)
    model = model.Where(
                        x => x.WANum.Contains(query) ||
                        x.Tooltip.ToLower(CultureInfo.InvariantCulture).Contains(query)
                        || x.Topic.ToLower(CultureInfo.InvariantCulture).Contains(query)
                    ).ToList();

if (selectedCategories.Count > 0)
        {
            bool HasMatch;
            foreach (var ancmt in announcements)
            {
                HasMatch = false;
                foreach (var cat in selectedCategories)
                {
                    foreach (var xref in ancmt.waXref)
                    {
                        if (cat.ID == xref.WACategoryID)
                        {
                            HasMatch = true;
                        }
                    }
                }
                if(HasMatch)
                {
                    model.Add(new EditViewModel
                    {
                        WATypeID = ancmt.WATypeID,
                        WANum = ancmt.WANum,
                        WATypeName = ancmt.waType.WATypeDescription,
                        Link = ancmt.Link,
                        Tooltip = ancmt.Tooltip,
                        Topic = ancmt.Topic,
                        WADate = ancmt.WADate,
                        WAID = ancmt.WAID,
                    });

                    ancmt.waXref.ToList().ForEach(
                        x => model.Last().Categories.Add(
                            new CategoryViewModel { ID = x.WACategoryID, Name = x.waCategory.WACategory, IsSelected = false }));
                }
            }
        }
        // If no catgories were selected, keep all announcements for next stage of search
        else
        {
            foreach (var ancmt in announcements)
            {
                model.Add(new EditViewModel
                {
                    WATypeID = ancmt.WATypeID,
                    WANum = ancmt.WANum,
                    WATypeName = ancmt.waType.WATypeDescription,
                    Link = ancmt.Link,
                    Tooltip = ancmt.Tooltip,
                    Topic = ancmt.Topic,
                    WADate = ancmt.WADate,
                    WAID = ancmt.WAID,
                });

                ancmt.waXref.ToList().ForEach(
                    x => model.Last().Categories.Add(
                        new CategoryViewModel { ID = x.WACategoryID, Name = x.waCategory.WACategory, IsSelected = false }));
            }
        }

I am using Method syntax, not query and prefer to stay in method syntax.

Robert Harvey
  • 178,213
  • 47
  • 333
  • 501
Ingenioushax
  • 718
  • 5
  • 20
  • 6
    This doesn't look like a Linq query to me. – Robert Harvey Jun 21 '16 at 16:08
  • It's difficult to tell from just the code what the expected outcome of the "query" should be. It looks like you're just setting a match flag on each record. Instead of doing that, why not run an actual Linq query? – Robert Harvey Jun 21 '16 at 16:10
  • Real code added @RobertHarvey – Ingenioushax Jun 21 '16 at 16:14
  • What are your performance measurements telling you is the hot spot in the code, i.e. that part of the code that needs to be optimized? – Robert Harvey Jun 21 '16 at 16:18
  • Hmmm... Good question. I don't have any metrics. Can you point me in right direction to obtain some performance metrics? – Ingenioushax Jun 21 '16 at 16:19
  • Use a profiler or put in some StopWatches. – Robert Harvey Jun 21 '16 at 16:20
  • @RobertHarvey, here is the metrics I received from a few runs. The runs are variations of all possible queries, with fairly little variation in run times. This is operating against approximately 40 test items. Run 1: 422ms Run 2: 298ms Run 3: 428ms Run 4: 299ms Run 5: 427ms Run 6: 370ms – Ingenioushax Jun 21 '16 at 16:33
  • Looks like "UI-chaotic data passed to some ninja logic and back to UI somehow whatever"-style to me. – Stefan Steinegger Jun 21 '16 at 16:34
  • OK. Where is the majority of time being spent? Which line(s) of code takes the most time? – Robert Harvey Jun 21 '16 at 16:34
  • I'm going to say that the n^3 loop structure is where the most time is taking place. @StefanSteinegger, I have a view model for the View, which is why the data "looks" chaotic. It needs some refactoring, but no time to refactor when you have 3 weeks between day 1 and deployment. – Ingenioushax Jun 21 '16 at 16:39
  • Post your sample data (your 40 test items). Include in your question the expected output. – Robert Harvey Jun 21 '16 at 16:48
  • That's not really an option. Considering the data can be searched for in a variety of ways, the "expected" output can vary. However, the following is always true about the result set. 1. A user can search for a string. The string may or may not exist in the record. 2. A user can narrow (filter) results by categories; there are 10 3. A user can narrow the range of the result set by date. 4. A user may use one or any combination of the above. – Ingenioushax Jun 21 '16 at 19:41

1 Answers1

1

For any real performance gains it would be best to create specific Stored Procedures to handle the true heavy lifting processing in the database and not on the client PC.

ΩmegaMan
  • 29,542
  • 12
  • 100
  • 122