1

I have the following code which works fine and I am getting the required details as expected. This data is passed on to the front end to be displayed in a grid.

public object Test(SearchGridParam obj)
{
    List<GridModel> res = 
        (from data in _entity.TableNoTracking
        where (string.IsNullOrWhiteSpace(obj.Criticality) || data.CRITICALITY.Equals(obj.Criticality))
        && (string.IsNullOrWhiteSpace(obj.Search)
                || (!string.IsNullOrWhiteSpace(data.DESCRIPTION) && data.DESCRIPTION.ToUpper().Contains(obj.Search.ToUpper()))
                || (!string.IsNullOrWhiteSpace(data.NUMBER) && data.NUMBER.ToUpper().Contains(obj.Search.ToUpper()))
                )
        select new GridModel
        {
            ID = data.ID,
            NUMBER = data.NUMBER,
            CRITICALITY = data.CRITICALITY,
            WO_STATUS = data.WO_STATUS,
            ASSET_NUMBER = data.ASSET_NUMBER,
            DESCRIPTION = data.DESCRIPTION,
            HOURS = data.HOURS,
            START_DATE = data.START_DATE
        })
    .OrderByDescending(ord => ord.START_DATE)
    .ToList(); //Gives 300+ records depending on the filters

    int count = res.Count();

    return new Result
    {
        Result = res.Skip(obj.Skip).Take(obj.Take), // fetching only 10
        Count = count
    };
}

Context

As per requirement:

  • I need the Count after the "where" condition is applied. (I have achieved it using: int count = res.Count();)
  • In the final result set, I need only 10 records at a time. (Achieved using: res.Skip(obj.Skip).Take(obj.Take))
  • I am getting the data from the database (dbContext).

Question

As you can see, I just need 10 records using skip & take. But, just to get the count, I am having to use ToList() (which loads all data to the memory). Is there any other optimized way of doing it? OR is there any alternative approach to skip, take & count?

What I have tried

  • I have tried the following to improve the performance, but it takes the same time (or sometimes more that .ToList())
// *****code above is same*****
.OrderByDescending(ord => ord.START_DATE)
.AsQueryable();

int count = res.Count();

return new Result
{
    Result = res.Skip(obj.Skip).Take(obj.Take).ToList(),
    Count = count
};

Edit 1

I am fetching the data from Oracle DB

Sampath
  • 84
  • 1
  • 1
  • 10
  • 5
    _"I am having to use ToList()"_ - no, you don't, just use `Count()`. – Guru Stron Apr 14 '23 at 08:42
  • 2
    _"But, just to get the count, I am having to use ToList()"_ - [mre] whould be nice and code showing how do you measure this (there are some caveats when coming to the time measurement). But 300 hundreds of records is not that much so fetching them into memory can be faster than making 2 queries (depending on the actual setup). Also remove the `Select` and `OrderBy...`, they are not needed for the `Count`, apply them only for the final query. – Guru Stron Apr 14 '23 at 08:45
  • ist that what you are looking for ? .OrderByDescending(ord => ord.START_DATE) .Skip(obj.Skip) .Take(obj.Take) .ToList(); – jeb Apr 14 '23 at 08:52
  • 1
    `Contains` makes table scan when performing execution. So if your table is big, query will be slow. For fast searching you have to use [FREETEXT](https://learn.microsoft.com/en-us/sql/t-sql/queries/freetext-transact-sql?view=sql-server-ver16) – Svyatoslav Danyliv Apr 14 '23 at 09:34
  • In _what you have tried_ you add an `.AsQueryable()`. This looks suspicous to me and should not be needed. If you can't work without that, please post the whole function again, to let us find the error. Everyting else looks okay. – Oliver Apr 14 '23 at 10:51
  • @GuruStron In my original code -> If I remove the `.OrderByDescending(...).ToList()` -> And change the result to `Result = res.OrderByDescending(...).Skip(obj.Skip).Take(obj.Take).ToList()`. The execution time almost doubles in my case because both `Count()` and `ToList()` are taking equal time to process. – Sampath Apr 14 '23 at 11:46
  • @Sampath how do you measure the time? Where is the app located compared to the database? – Guru Stron Apr 14 '23 at 11:48
  • Consider using Keyset Pagination (paging by key) instead of Rowset Pagination (paging by rownumber), see https://stackoverflow.com/questions/70519518/is-there-any-better-option-to-apply-pagination-without-applying-offset-in-sql-se/70520457#70520457 – Charlieface Apr 14 '23 at 12:07
  • @GuruStron _how do you measure the time?_ : I used `Stopwatch()`. However, even that is not required because the delay is clearly visible during debugging. _Where is the app located compared to the database?_ : I am not sure of the exact location but it is not in my local system. I don't have direct access to the DB. – Sampath Apr 14 '23 at 12:22
  • How many times you have run the measurements before restarting the app? _"I am not sure of the exact location"_ - that can be a crucial thing here which can make your code run fast or slow. – Guru Stron Apr 14 '23 at 12:30
  • 1
    Use a library like [PagedList](https://www.nuget.org/packages/PagedList.Core/). Give it an `IQueryable` and it automatically executes a Skip/Take and a `Count()` query on it, both of which run in the database. – Gert Arnold Apr 14 '23 at 12:58

1 Answers1

1

The result of using Linq differs depending upon whether your using a Data connection like EntityFramework or similar or not. In your case, it may seem as though you're using EntityFramework because you are querying from _entity.TableNoTracking, but as soon as you use a terminal operation, such as ToList(), then the rest of the Linq expression uses Linq to Objects which operates in memory.

So, when you do your query (all of it before ToList), LINQ will build a database query. After ToList, it does that in memory, so .Count is done in memory.

This is important, especially with Skip, Take and count. In fact, if you use Skip and Take to query 10 records using the database, EntityFramewokr will build an SQL statement that does that - it's the database that selects the 10 records, and LINQ never iterates the whole record set. When you come to count, that's done in memory and you always get 10 or less.

I would suggest that the right way to do this (if you want the actual recordset size) is to do the following:

var corequery = (from data in _entity.TableNoTracking where ...  select ... ).AsQueryable();

var records = corequery.OrderBy().Skip().Take().ToList();

var countofrecords = corequery.Count();

var countofrecordsreturned = records.Count();

Considering the above, Skip and Take form a part of the database query due to appearing before the terminal. A second Count query is also sent to the server, but that constructs a "SELECT Count(*)" kind of query and only returns an integer. You then have the ability to count the number of records in the recordset and seperately count how many records were returned. This obviously allows you to properly calculate how many pages of data you have.

If you're really bothered about roundtrip time of the second count query, I would suggest that you despatch the count query asyncronously. e.g.

    var tcount = query.CountAsync();
    var data = await query.ToListAsync();
    var count await tcount;

The longer data query entirely masks the interconnect overhead of the count query, effectively, barring scheduling and calculation in the database engine, you would get the count for free.

Mark Rabjohn
  • 1,643
  • 14
  • 30
  • Thanks for your input. I have a follow-up doubt/observation. In my original LINQ query, the `ToList()` is taking x seconds. When I implemented the code as you suggested; the line _records_ is taking x seconds & _countofrecords_ is taking x seconds again. Effectively, almost doubling the execution time. Is this a symptom of a different issue entirely? (Ps: I did not use any _stopWatch_ here, but while debugging, the pause is clear. Also, when observed from the UI end, the delay is very apparent) – Sampath Apr 16 '23 at 05:37
  • 1
    It all depends upon pooling, database locks, query cache and the capability of the database to execute these two queries in parallel. I generally use either Microsoft SQL or SAP Hana DB, SQL tends to give you the count seemingly for free - HANA can pause whilst it does a Prepare Statement, but the next time you need to do this when the database is primed it can be much quicker. The important thing is that the CountAsync should return immediately, and the the await tcount should be immediate too - if that is not the case, then perhaps your database pooling config isn't letting you run parallel. – Mark Rabjohn Apr 17 '23 at 15:29
  • If the query is expensive to execute, lets say the underlaying tables contains millions of rows and the result set is small, executing it twice is not a good option, even if done in parallel. On the other hand if the query is quick but the result set is large the oposite would be true. – Magnus Apr 18 '23 at 07:05