0

I am using MVC4 with Entity Framework 5 for a project. We have a master table named MAIN_TABLE and have few child tables (ie: CHILD_TABLE1, CHILD_TABLE2, etc..)

We are facing a problem of the speed of LINQ query execution because of few filter option in these different Child Tables.

I have to write a query for filtering data from the Model using EF5. Right now we are coded based on the single filter at a time. i.e. We are checking with each filtered columns and firing the query. But its too slow. Is there any another option?

string[] strValue = filter_Values;
foreach (SelectedData selectedData in objSelectedDataCollection.DataCollection)
{
    switch (selectedData.ColumnName) // This is the column name in the GridView defined
    {
        case "Outlook":
            indlist = from jd in indlist 
                      where jd.IND_APP_PASS_STATUS.Any(
                                                  ob => strValue.Contains(ob.Outlook))
                      orderby jd.Indman_ID
                      select jd;
            break;
        case "RS_TP":
            indlist = from jd in indlist
                      where jd.IND_APP_PASS_STATUS.Any(
                                                  ob => strValue.Contains(ob.RS_TP))
                      orderby jd.Indman_ID 
                      select jd;
            break;
        case "Code":
            indlist = (from jd in indlist from jk in jd.IND_APP_PASS_STATUS where strValue.Contains(jk.Code) select jd).ToList();
            break;
    }
}
Pavel Voronin
  • 13,503
  • 7
  • 71
  • 137
JKANNAN
  • 83
  • 1
  • 10
  • Hmm doesn't `string.Contains()` actually execute on your end and not on SQL server side? Anyone? If so that's why it's slow –  Feb 18 '15 at 14:10
  • 8
    Use a SQL profiler to see what the query being sent to the database engine is. You may need to write the SQL manually instead of using EF if you need to optimize the query. – JNYRanger Feb 18 '15 at 14:12
  • @MickyDuncan, Linq To Entities will translate `strValue.Contains` to the appropriate SQL `IN(...)` statement. – haim770 Feb 18 '15 at 14:44
  • 2
    you might need some indexes. – Daniel A. White Feb 18 '15 at 14:47
  • 1
    For little projects, I love EF for its simplicity. When I have to deal with more complex queries, I replace them by stored procedures. – Deblaton Jean-Philippe Feb 18 '15 at 14:51
  • 2
    The most useful thing to post is the final query and its execution plan. Without these, it's impossible to answer this question without knowing the table's schema, what indexes are defined and how many rows it contains. You also need to provide the mappings used - if `jd` uses inheritance, it may result in one or more joins in the final query. The first thing I'd look though is IND_APP_PASS_STATUS`. If it isn't covered by an index, the database will be forced to scan the entire table. – Panagiotis Kanavos Feb 18 '15 at 14:56
  • Agree with @DeblatonJean-Philippe. If your query is too slow, replace it with a stored procedure. That's what they're for. – Chris Pratt Feb 18 '15 at 15:23
  • Your code shows `indlist` being filtered, but it's not clear what `indlist` is. Therefore I can't tell whether the queries will run on a collection which has already been read from the DB, or on the DB itself. I'd include enough code to show how `indlist` is set and where your DbContext is queried. – Andy Nichols Feb 18 '15 at 15:43
  • @AndyNichols 'indlist' is already queried from the DB. IEnumerable indlist; indlist = db.IND_TABLE.ToList().Where(x => x.Package_No.Trim() != "").OrderBy(x => Int32.Parse(x.Package_No)).Select(x => x).ToList(); – JKANNAN Feb 19 '15 at 03:59
  • I agree with the poster on one of the other answers that the ToList() operator in the initialization for this filtering operation is going to force EF to materialize a potentially large number of objects, and the actual filtering is then going to occur in memory. If the number of materialized objects is large, this will be slow. Although one could use SQL Profiler to see this, I find that the EF Profiler from Hibernating Rhinos makes this type of performance debugging much easier since it directly shows you the code that causes a given query be passed to SQL Server for execution. – Terry Coatta Sep 12 '15 at 17:26

3 Answers3

1

There are two sides of EF performance - server and client (your application).

First, as was mentioned in the comments use SQL profiler to see how fast generated query executes.

Also pay attention to the number of returned records. Temporarily switching automatic changes detection off for the queries returning particularly large result set can give a substantial boost to performance.

Community
  • 1
  • 1
Pavel Voronin
  • 13,503
  • 7
  • 71
  • 137
1

From your comment the initial database query is

indlist = db.IND_TABLE
            .ToList()
            .Where(x => x.Package_No.Trim() != "")
            .OrderBy(x => Int32.Parse(x.Package_No))
            .Select(x => x)
            .ToList<IND_TABLE>();

The first .ToList() means that the whole of IND_TABLE will be returned from the database. All other filtering is then done in code. This is one of the reasons for poor performance - it's almost always better to filter on the DB than to return everything.

Also note that you are sorting your results multiple times. Firstly when setting up indlist, and then once in each iteration of objSelectedDataCollection.DataCollection. This is unnecessary and you should not sort at all until you are finished filtering. Maybe after your foreach loop you could have a line indlist = indlist.OrderBy(x => x.Indman_ID);

Putting all this together would give you the following.

var indlist = db.IND_TABLE
                .Where(x => x.Package_No.Trim() != "");

string[] strValue = filter_Values;
foreach (SelectedData selectedData in objSelectedDataCollection.DataCollection)
{
    switch (selectedData.ColumnName)
    {
        case "Outlook":
            indlist = indlist.Where(il => il.IND_APP_PASS_STATUS.Any(iaps => strValue.Contains(iaps.Outlook)));
            break;
        case "RS_TP":
            indlist = indlist.Where(il => il.IND_APP_PASS_STATUS.Any(iaps => strValue.Contains(iaps.RS_TP)));
            break;
        case "Code":
            indlist = indlist.Where(il => il.IND_APP_PASS_STATUS.Any(iaps => strValue.Contains(iaps.Code)));
            break;
    }
}

indlist = indlist.OrderBy(x => x.Indman_ID).ToList();

It's worth reading up on when Entity Framework creates the SQL to query the database (search for the term "deferred execution"). It only happens when you try to use the results - such as with ToList(), ToArray(), SingleOrDefault() and various other things. The line context.TableName.Where(some lambda expression); won't cause a DB query at that point. You can keep filtering using Where without a DB call, which is what it happening in the code I posted. The SQL will only be generated and the DB queried at the line indlist = indlist.OrderBy(il => il.Indman_ID).ToList();

Andy Nichols
  • 2,952
  • 2
  • 20
  • 36
  • There is a problem in the above answer , the above function returns zero element if I am not giving ToList() at end of filter section (inside Select case). – JKANNAN Feb 19 '15 at 12:48
  • In addition to being careful about when you materialize queries (which I agree has a huge performance impact when using EF), we have also observed that the Any() operator can sometimes generate very sub-optimal queries (particularly when the queryable it is applied to is complex). We have found that the Count() operator seems to be much more 'stable' in terms of query generation and have transformed all occurrences of Any() in our code into Count(). e.g. myQueryable.Any(x => p(x)) becomes myQueryable.Count(x => p(x)) > 0. – Terry Coatta Sep 12 '15 at 17:16
0

The issue was in the query which I written.

Please check the query below. Now I use 'join' instead of querying in the subtables.

NOTE: Answer is if we are writing query using Subtables then each parent row hit each subtable row in the database. and by this reason your code performance will reduce according to the number of rows.i.e. if subtable has 10 rows it will hit 10 times to the database. But if we are using 'join' between the two tables now that code will hit only once with each column.

var indlist = db.IND_TABLE
            .Where(x => x.Package_No.Trim() != "");

 string[] strValue = filter_Values;
 foreach (SelectedData selectedData in objSelectedDataCollection.DataCollection)
 {
    switch (selectedData.ColumnName)
    {
        case "Outlook":
            indlist = (from jd in indlist join ipas in dbContext.IND_APP_PASS_STATUS on jd.Indman_ID equals ipas.Indman_ID where (strValue.Contains(ipas.Outlook)) orderby jd.Indman_ID select jd).ToList<IND_TABLE>();
            break;

        case "RS_TP":
            indmanlist = (from jd in indmanlist join ipas in dbContext.IND_APP_PASS_STATUS on jd.Indman_ID equals ipas.Indman_ID where (strValue.Contains(ipas.RS_TP)) orderby jd.Indman_ID select jd).ToList<IND_TABLE>();
            break;

        case "Code":
            indmanlist = (from jd in indmanlist join ipas in dbContext.IND_APP_PASS_STATUS on jd.Indman_ID equals ipas.Indman_ID where (strValue.Contains(ipas.Code)) orderby jd.Indman_ID select jd).ToList<IND_TABLE>();
            break;
    }
}

I had used SQL Profiler to check the executions with database. Then I realize the real issue of hitting to the database.

JKANNAN
  • 83
  • 1
  • 10