
I seem to have written a very slow piece of code, and it gets even slower when EF Core is involved.

Basically, I have a list of items that store their attributes as a JSON string in the database, since I am storing many different items with different attributes.

I then have another table that contains the display order for each attribute, so when I send the items to the client I order them based on that order.

It is kinda slow: about 18-30 seconds for 700 records (measured from where I start my timer, not the whole block of code).

    var itemDtos = new List<InventoryItemDto>();

    var inventoryItems = dbContext.InventoryItems
        .Where(x => x.InventoryCategoryId == categoryId);

    var inventorySpecifications = dbContext.InventoryCategorySpecifications
        .Where(x => x.InventoryCategoryId == categoryId)
        .Select(x => x.InventorySpecification);

    Stopwatch a = new Stopwatch();
    a.Start();

    foreach (var item in inventoryItems)
    {
        var specs = JObject.Parse(item.Attributes);
        var specDtos = new List<SpecDto>();

        foreach (var inventorySpecification in inventorySpecifications.OrderBy(x => x.DisplayOrder))
        {
            if (specs.ContainsKey(inventorySpecification.JsonKey))
            {
                var value = specs.GetValue(inventorySpecification.JsonKey);

                var newSpecDto = new SpecDto()
                {
                    Key = inventorySpecification.JsonKey,
                    Value = value.ToString()
                };

                specDtos.Add(newSpecDto);
            }
        }

        var dto = new InventoryItemDto()
        {
            // create dto
        };

        itemDtos.Add(dto);
    }

Now it goes crazy slow when I make EF pull in some more columns that I need info from.

In the // create dto area I access some information from other tables:

    var dto = new InventoryItemDto()
    {
        // access brand columns
        // access company columns
        // access branch columns
        // access country columns
        // access state columns
    };

Trying to access these columns in the loop makes it take 6 minutes to process 700 rows.

I don't understand why it is so slow; it's the only change I really made, and I made sure to eager load everything.

To me it almost seems like eager loading is not working, but I don't know how to verify whether it is.

    var inventoryItems = dbContext.InventoryItems
        .Include(x => x.Branch).ThenInclude(x => x.Company)
        .Include(x => x.Branch).ThenInclude(x => x.Country)
        .Include(x => x.Branch).ThenInclude(x => x.State)
        .Include(x => x.Brand)
        .Where(x => x.InventoryCategoryId == categoryId)
        .ToList();

so I thought that, because of this, the speed would not be much different from the original 18-30 seconds.

I would like to speed up the original code too, but I am not really sure how to get rid of the nested foreach loops that are probably slowing it down.

chobo2
  • To help you better understand what EF is running behind the scenes add some logging in to expose the SQL being run which might help you see how/where your queries are going wrong. https://learn.microsoft.com/en-us/ef/core/miscellaneous/logging – Henry Nov 01 '18 at 14:01
  • Have you had any luck with any of the answers given by everyone here? – Henry Nov 08 '18 at 11:29

3 Answers


First, loops inside loops are a very bad thing; you should refactor that out and make it a single loop. This should not be a problem, because inventorySpecifications is declared outside the loop.

Second, the line

    var inventorySpecifications = dbContext.InventoryCategorySpecifications
        .Where(x => x.InventoryCategoryId == categoryId)
        .Select(x => x.InventorySpecification);

should end with ToList(), because its enumeration happens inside the inner foreach, which means the query is re-run for each item in inventoryItems.

That should save you a good amount of time.
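Put together (a sketch reusing the question's identifiers, untested against the actual schema), the spec query would be materialized once, already ordered, before the loop:

```csharp
// Hedged sketch using the question's names: materialize the ordered specs
// once so the database is queried a single time, not once per item.
var orderedSpecs = dbContext.InventoryCategorySpecifications
    .Where(x => x.InventoryCategoryId == categoryId)
    .Select(x => x.InventorySpecification)
    .OrderBy(x => x.DisplayOrder)
    .ToList();

foreach (var item in inventoryItems)
{
    var specs = JObject.Parse(item.Attributes);
    foreach (var spec in orderedSpecs) // in-memory list: no SQL per iteration
    {
        // ... same ContainsKey/GetValue logic as in the question
    }
}
```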

Leonardo
  • Can you expand on your first point. I am not really seeing how to extract it out as each item has to go through and be checked. – chobo2 Oct 30 '18 at 16:39
  • Also to create take my InventoryItemDto it takes 500 milliseconds to make, what I find is super slow. I don't understand why as like I said I am eager loading. – chobo2 Oct 30 '18 at 17:15
  • "Loops inside of loops is a very bad thing..." For what reason? – Kenneth K. Nov 01 '18 at 15:26
  • loops within loops are O(n·m) in complexity, meaning that the inner loop body executes once for every combination of the two collections... I recommend always trying to avoid them and refactoring some other way... – Leonardo Nov 01 '18 at 16:53
  • check out this: https://softwareengineering.stackexchange.com/questions/199196/why-are-nested-loops-considered-bad-practice – Leonardo Nov 01 '18 at 17:15

I'm no expert, but this part of your second foreach raises a red flag: inventorySpecifications.OrderBy(x => x.DisplayOrder). Because this is called inside another foreach, the .OrderBy runs on every iteration over inventoryItems.

Before your first foreach loop, try this: var orderedInventorySpecs = inventorySpecifications.OrderBy(x => x.DisplayOrder); and then use foreach (var inventorySpec in orderedInventorySpecs) and see if it makes a difference.
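Concretely (a sketch with the question's names; note that adding ToList() here also keeps the underlying query from re-running, though this is untested):

```csharp
// Sort (and query) once, up front; the loop then walks an in-memory list.
var orderedInventorySpecs = inventorySpecifications
    .OrderBy(x => x.DisplayOrder)
    .ToList();

foreach (var inventorySpec in orderedInventorySpecs)
{
    // ... same body as the original inner loop
}
```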

Christopher Lake
  • You could also filter the results from the `orderedInventorySpecs` such as `orderedInventorySpecs.Where(x => specs.ContainsKey(x.JsonKey))` to avoid looping over all of the records? – Simply Ged Oct 30 '18 at 22:38

To help you better understand what EF is running behind the scenes, add some logging to expose the SQL being run, which might help you see how/where your queries are going wrong. This can be extremely helpful for determining whether your queries are hitting the DB too often. As a very general rule you want to hit the DB as few times as possible, and retrieve only the information you need via the use of .Select() to reduce what is being returned. The docs for the logging are: https://learn.microsoft.com/en-us/ef/core/miscellaneous/logging
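As a sketch of how that logging can be wired up (the exact API depends on your EF Core version, and the context name here is an assumption; this follows the pattern in the linked docs):

```csharp
// Assumed context name; attach a console logger so EF Core prints the SQL
// it generates for every command it sends to the database.
public class InventoryDbContext : DbContext
{
    private static readonly ILoggerFactory SqlLoggerFactory =
        LoggerFactory.Create(builder => builder
            .AddConsole()
            .AddFilter(DbLoggerCategory.Database.Command.Name, LogLevel.Information));

    protected override void OnConfiguring(DbContextOptionsBuilder options)
        => options.UseLoggerFactory(SqlLoggerFactory);
}
```

Watching the console while the loop runs will show immediately whether one query executes or hundreds do.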

I obviously cannot test this, and I am a little unsure where your SpecDtos go once you have them, but I assume they become part of the InventoryItemDto?

    var inventoryItems = dbContext.InventoryItems
        .Where(x => x.InventoryCategoryId == categoryId)
        .Select(x => new InventoryItemDto()
        {
            Attributes = x.Attributes,
            //.....
            // access brand columns
            // access company columns
            // access branch columns
            // access country columns
            // access state columns
        }).ToList();


    var inventorySpecifications = dbContext.InventoryCategorySpecifications
        .Where(x => x.InventoryCategoryId == categoryId)
        .OrderBy(x => x.DisplayOrder)
        .Select(x => x.InventorySpecification)
        .ToList();



    foreach (var item in inventoryItems)
    {
        var specs = JObject.Parse(item.Attributes);
        // Assuming the specs become part of an inventory item?
        item.Specs = inventorySpecifications
            .Where(x => specs.ContainsKey(x.JsonKey))
            .Select(x => new SpecDto() { Key = x.JsonKey, Value = specs.GetValue(x.JsonKey).ToString() })
            .ToList();
    }

The first call to the DB for inventoryItems should produce one SQL query that pulls all the information you need at once to construct your InventoryItemDto, and thus only hits the DB once. The specs query uses OrderBy() before materialising, which means the ordering runs as part of the SQL query rather than in memory. Both results are materialised via .ToList(), which causes EF to pull them into memory in one go.

Finally, the loop goes over your constructed inventoryItems, parses the Json and then filters the specs based on it. I am unsure of where you were using the specDtos, so I made the assumption that they were part of the model. I would recommend checking the performance of the Json work you are doing, as that could be contributing to your slowdown.

A more integrated approach to using Json as part of your EF models can be seen at this answer: https://stackoverflow.com/a/51613611/621524 however you will still be unable to use those properties to offload execution to SQL as accessing properties that are defined within code will cause queries to fragment and run in several parts.

Henry