1

I am trying to sort all the updated item in DataTableA, by coloring the item that has not been completely updated, and removing the item that has been updated completely from the DataTable. both The item that has been updated completely and the incomplete updated item are in "managed" table in the database, the discharge date will be null if it has not been completely updated.

Below code works but it can take all day for the page to run. This is C# webform.

The code below is writing on my code behind file:

 foreach (GridDataItem dataItem in RadGrid1.Items)
  {
    var panu = dataItem["Inumber"];
    var panum = panu.Text;
    var _cas = db.managed.Any(b =>
        b.panumber == panum && b.dischargedate != null);                
    var casm = db.managed.Any(b => 
        b.panumber == panum && b.dischargedate == null);
   if (_cas == true)
     {
        dataItem.Visible = false;
        
      }
   if (casm == true)
     {
        dataItem.BackColor = Color.Yellow;
     }

 }
  • So `db.managed` is a `DbSet`? – vgru Sep 29 '22 at 14:18
  • Yes, is a DbsSet .managed is A Table in the database i use db.manaeged to get data from the table managed – PATRICIA KAREN Sep 29 '22 at 14:21
  • 1
    Ok so this basically means that every loop iteration will be a database call, which will be quite slow. And how many items are there in the database in total? Is it feasible to load `dischargedate` for all the items into memory before doing this? – vgru Sep 29 '22 at 14:22
  • it sounds like your UI code is both generating multiple database queries and UI updates for every single item in the table. You may need to consider redesigning your code to minimize both. Generally you just want your UI code to be responsible for rendering data (and capturing user actions/commands)--and to delegate business logic and data access to supporting code. – STW Sep 29 '22 at 14:23
  • for now the number of item in the database, it's about 900, but the number is expected to increase every daily – PATRICIA KAREN Sep 29 '22 at 14:29
  • 1
    Ah so the items in the UI are being repeated many times? Since you wrote there are 20,000 items, but the database only contains 900? These details are important to figure out how to balance between loading everything and loading only what you need. – vgru Sep 29 '22 at 15:19
  • 1
    Some good answers already, but also consider whether it would be possible to modify the result at the database level. Either add a computed column, or create a view, etc, depending on what makes the most sense for your use case. – BurnsBA Sep 29 '22 at 15:23
  • ok, here the item in RadGrid1.Items,(the table) is gotten by joinini 4 table like this, var radgrideTableItem = ( from o in db.taxTables join p in db.users on o.userid equals p.useridtin join like 4 more table here. where o.servicetype == "FirGradeTaxs" select new { o.Inumber, b.lcode, b.lastname, b.firstname, p.uyname, a.ksjsva o.mjjwwdate sc.desc }); the item here is about 20, 000, bind in grid – PATRICIA KAREN Sep 30 '22 at 07:22

2 Answers2

1

Well, 900 items might be worth simply fetching to a list in memory and then process that. It will definitely be faster, although it consumes more memory.

You can do something like this (assuming the type of managed is Managed):

List<Managed> myList = db.managed.ToList();

That will fetch the whole table.

Now replace your code with:

foreach (GridDataItem dataItem in RadGrid1.Items)
  {
    var panu = dataItem["Inumber"];
    var panum = panu.Text;
    var _cas = myList .Any(b =>
        b.panumber == panum && b.dischargedate != null);                
    var casm = myList .Any(b => 
        b.panumber == panum && b.dischargedate == null);
   if (_cas == true)
     {
        dataItem.Visible = false;
        
      }
   if (casm == true)
     {
        dataItem.BackColor = Color.Yellow;
     }

 }

You should see a huge performance approvement.

Another thing: You don't mention what database you're using, but you should make sure the panumber column is properly indexed.

Poul Bak
  • 10,450
  • 5
  • 32
  • 57
  • ok the database is sql thanks – PATRICIA KAREN Sep 29 '22 at 15:07
  • One thing to consider, when choosing the method is: How many simultaneous runs of this code? If it's some admin code, with only one user at a time, using much memory is not really an issue, but if thousands run this code at the same time, it will of course sum up. – Poul Bak Sep 29 '22 at 15:40
  • about 100 people may run the code at the same time, that is the maximum – PATRICIA KAREN Sep 29 '22 at 15:47
  • I suggest you measure the memory in use and multiply by 100, then you know if it's acceptable. – Poul Bak Sep 29 '22 at 16:01
  • have you tried to run the code? If so how long time did it take? – Poul Bak Sep 29 '22 at 16:07
  • 1
    Regarding memory usage and performance, you should preferably only get the columns you need from the database, instead of loading all entities. For example, `db.managed.Select(x => new { x.panumber, x.dischargedate }).ToList()` will only fetch these two columns. If you do `db.managed.ToList()`, EF core will load all those entities with all properties, with tracking enabled. Most of the queries in EF Core should be projections (`.Select`, or mapping to DTOs using automapper or similar tool), or at least using `.AsNoTracking()` to avoid the overhead of entity change tracking. – vgru Sep 29 '22 at 16:09
  • @Groo: You're right, my code was mainly meant as a simple starting base to see how much time it now takes. Depending on how many columns there are, your code can save some memory. – Poul Bak Sep 29 '22 at 16:12
1

As mentioned in the comment, each call to db.managed.Any will create a new SQL query.

There are various improvements you can make to speed this up:

  1. First, you don't need to call db.managed.Any twice inside the loop, if it's checking the same unique entity. Call it just once and check dischargedate. This alone with speed up the loop 2x.

     // one database call, fetching one column
     var dischargedate = db.managed
         .Select(x => x.dischargedate)
         .FirstOrDefault(b => b.panumber == panum);
    
     var _cas = dischargedate != null;
     var casm = dischargedate == null;
    
  2. If panumber is not a unique primary key and you don't have a sql index for this column, then each db.managed.Any call will scan all items in the table on each call. This can be easily solved by creating an index with panum and dischargedate, so if you don't have this index, create it.

  3. Ideally, if the table is not huge, you can just load it all at once. But even if you have tens of millions of records, you can split the loop into several chunks, instead of repeating the same query over and over again.

  4. Consider using better naming for your variables. _cas and casm are a poor choice of variable names.

    Pro tip: Always code as if the person who ends up maintaining your code is a violent psychopath who knows where you live.

So if you don't have hundreds of thousands of items, here is the simplest fix: load panumber and discharge values for all rows from that table into memory, and then use a dictionary to instantly find the items:

// load all into memory
var allDischargeDates = await db.managed
    .Select(x => new { x.panumber, x.dischargedate })
    .ToListAsync(cancellationToken);

// create a dictionary so that you can quickly map panumber -> dischargedate
var dischargeDateByNumber = dbItems
    .ToDictionary(x => x.panumber, x => x.dischargedate);
    
foreach (var dataItem in RadGrid1.Items)
{
    var panu = dataItem["Inumber"];
    var panum = panu.Text;
    
    // this is very fast to check now
    if (!dischargeDateByNumber.TryGetValue(panum, out DateTime? dischargeDate))
    {
        // no such entry - in this case your original code will just skip the item
        return;            
    }
    
    if (dischargeDate != null)
    {
        dataItem.Visible = false;
    }
    else
    {
        dataItem.BackColor = Color.Yellow;
    }
}

If the table is huge and you only want to load certain items, you would do:

// get the list of numbers to fetch from the database
// (this should not be a large list!)
var someList = RadGrid1
   .Items
   .Select(x => x["Inumber"].Text)
   .ToList();

// load these items into memory
var allDischargeDates = await db.managed
    .Where(x => someList.Contains(x.panumber))
    .Select(x => new { x.panumber, x.dischargedate })
    .ToListAsync(cancellationToken);

But there is a limit on how large someList can be (you don't want to run this query for a list of 200 thousand items).

vgru
  • 49,838
  • 16
  • 120
  • 201