
I have been using EF with LINQ against SQL Server in my project for a long time.

I was searching for a way to improve the performance of my requests to the DB, and I read a lot about Dapper and stored procedures being faster than EF. I added Dapper to the project, I added a stored procedure... but my tests showed strange results. EF, Dapper, and the stored procedure have almost the same timings; there is no benefit.

First of all, I tested with a query that has a lot of joins. I got almost the same results from Dapper, the stored procedure, and EF. Then I decided to run a test against one simple table without relations.

I have a ZipCodes table with 43,200 records.

(screenshot of the ZipCodes table)

I ran tests with 1,000, 10,000, and 43,200 records, using EF, Dapper, a stored procedure, and a query run directly in SQL Server.

Dapper

string query = @"
    SELECT TOP (43200)
        [Zip], [City], [State], [Latitude], [Longitude], [TimeZone], [DST]
    FROM [dbo].[ZipCodes]";

using (IDbConnection connection = new SqlConnection(_connectionString))
{
    // Use the synchronous API here; calling .Result on QueryAsync blocks
    // the calling thread and can deadlock in some contexts.
    var result = connection.Query<ZipCodes>(query).ToList();
    return result;
}

EF

var zip = db.ZipCodes.AsNoTracking().Take(43200).ToList();

Stored procedure

ALTER PROCEDURE [dbo].[ZIPTest]
AS 
BEGIN 
    SELECT TOP (43200) 
        [Zip], [City], [State], [Latitude], [Longitude], [TimeZone], [DST] 
    FROM 
        [dbo].[ZipCodes] 
END

Query run directly in SQL Server, with timing

SELECT GETDATE(); 

SELECT TOP (43200) 
    [Zip], [City], [State], [Latitude], [Longitude], [TimeZone], [DST]
FROM 
    [dbo].[ZipCodes]

SELECT GETDATE();

In the code I use a Stopwatch:

string first = "", second = "", third = "";
var swatch = new System.Diagnostics.Stopwatch();
swatch.Start();
// Dapper request here

then

swatch.Stop();
first = swatch.Elapsed.ToString(@"m\:ss\.fff");
swatch = new Stopwatch();
swatch.Start();

And so on
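For a fairer measurement, a hand-rolled harness should at least include a warm-up run and average over several iterations (a library such as BenchmarkDotNet does this properly). A minimal sketch, assuming a hypothetical LoadWithDapper() method that wraps the Dapper call above:

```csharp
// LoadWithDapper() is a made-up wrapper around the Dapper query shown earlier.
void Benchmark()
{
    // Warm-up: JIT compilation, connection-pool creation, model building
    // all land in the first call and would otherwise skew the numbers.
    LoadWithDapper();

    const int runs = 10;
    var swatch = System.Diagnostics.Stopwatch.StartNew();
    for (int i = 0; i < runs; i++)
    {
        LoadWithDapper();
    }
    swatch.Stop();

    Console.WriteLine($"Average: {swatch.Elapsed.TotalMilliseconds / runs:F1} ms");
}
```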

Results (in milliseconds):

                       1000     10000      43200
-------------------------------------------------
EF                      107      1085       4527
Dapper                  139      1084       4036
Stored procedure        129      1089       4519
SQL query                 8        17         60

The differences between EF, Dapper, and the stored procedure are very small. Why is that?

And why is the query run directly in SQL Server so fast, while the requests from code are 15-70 times slower?

Is this OK or not?

marc_s
olegl19
  • Benchmarking is hard. For starters: was this a release build? Did you use benchmarking frameworks? And lots more... – TGlatzer Mar 17 '19 at 09:57
    The raw SQL query and stored procedure are missing many things: the time it takes to actually transfer the data from the server to your C# code, the time to create and populate the model classes, the iterating over the result sets, the converting of the raw SQL data into C# classes and instances - this is the nitty-gritty "glue" code that both EF and Dapper take care of for you. You're comparing apples to elephants here - not a very fair and realistic comparison... – marc_s Mar 17 '19 at 10:00
    I would expect the cost of SQL Server query execution and data transfer to be the same in these cases. The disparity between the SQL query and the ORMs reflects the cost of client-side ORM processing. You will likely see performance similar to the SQL query in your app code if you use a SqlCommand/SqlDataReader without the object mapping. – Dan Guzman Mar 17 '19 at 10:54
  • Maybe helpful: https://stackoverflow.com/a/43590624/5779732 – Amit Joshi Mar 18 '19 at 07:01

2 Answers

6

Code using EF exhibiting performance issues will not magically run faster with Dapper or ADO + Sprocs. To get to the bottom of the performance problem you need to investigate and eliminate the causes of the performance issues.

At the top level, these performance issues stem from two core problems.

  • Loading too much data.
  • Loading data too often.

The key things I look out for (as a start; there are many more items, but these are the big wins):

  1. Lazy Loading: This is where the code is loading from a set of related entities, but in doing so, the code is accessing these related entities after the initial load, causing each of these related entities to be loaded individually.

    • Hunting Method: Hook up an SQL profiler against a database that just your debugging instance of the app is running against. (I.e. a local DB)
    • Symptom: Seeing a whole lot of "SELECT TOP (1)..." queries after a main query to load a single entity, or collection.
    • Fix: The quick fix is to introduce eager loads (.Include()) to load these collections. The better fix is to use .Select() to just load the properties needed by the code in question.
  2. .ToList(): Misplaced .ToList() calls can cause huge performance headaches as systems mature, because developers encountered a problem with EF that was "solved" by calling .ToList(). Usually these appear when developers try calling a method inside a .Where() or .Select() expression. EF cannot translate the method into an SQL equivalent, so adding a .ToList() first converts the query to Linq2Objects and "ta-da", it works!

    • Hunting Method: Look for instances of .ToList() and flag any cases where you find a .ToList() before a .Select() or .Where() etc.
    • Symptom: Removing the extra .ToList() causes an EF error.
    • Fix: Check to see if the offending function has a DbFunctions equivalent. A common problem is working with DateTime functions which can be found in DbFunctions. In other cases, find the offending function and create a View Model for the expected data that is being selected, then create a property to run the function in the view model.
  3. Client-side pagination + entities: Another sin of development without proper sample data. Queries are written that effectively return all data without consideration to the total # of records. The data gets displayed client-side in a grid with pagination, which "works" but is really really slow. It worked fine when there were only 200 rows in the database, but now crawls with 50,000. (and will only get worse)

    • Hunting Method: Look at any API/Controller method that returns collections. Do these queries use .ToList() or .Skip()+.Take()? Do these methods return back entities, or view models?
    • Symptom: Paginated lists are really slow to load. Once loaded, switching pages is fast.
    • Fix: Calls from pagination controls should be modified to use server-side pagination. This means sending sorting, page size, and page # information to the server calls. This way the EF queries can be made more efficient to load just the # of rows needed to be displayed by the control. The code should also return view models for the search results which only present the columns displayed, and the keys needed to load a full entity on demand. (such as when a user clicks to open a record.) Entities can be heavyweight when you only need to display a handful of fields.
  4. Database indexing: Has the database been monitored and maintained? Are there indexes in place, and index maintenance? For SQL Server, is the database being backed up and having its Tx Log shrunk? Code First implementations are rife with these issues, where a system is spun up without any consideration for the backing database. As systems grow, no care is given to the database supporting them.

    • Hunting Method: Do you have a dedicated DBA that is looking after the database and giving the all-clear that it is performing up to scratch?
    • Symptoms: No DBA or scrutiny has been given to the database. The application is using GUID PKs, for example, which are set to NEWID() or Guid.NewGuid() without any index maintenance. The database has no indexes set. The transaction log (.LDF) is 15x larger than the database file (.MDF), etc.
    • Fix: Hire a DBA. If you are using GUID keys, switch over to NEWSEQUENTIALID() and establish some scheduled index maintenance jobs.
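To make points 1 and 3 concrete, here is a minimal sketch. The Order/OrderLine entities and the paging variables are made up for the example; only the EF calls (.Include(), .Select(), .Skip(), .Take()) are the real API:

```csharp
// Lazy-loading fix: eager-load the related collection in a single query...
var orders = db.Orders
    .Include(o => o.OrderLines)
    .ToList();

// ...or better, project only the columns the screen actually needs.
var rows = db.Orders
    .Select(o => new { o.Id, CustomerName = o.Customer.Name, LineCount = o.OrderLines.Count })
    .ToList();

// Server-side pagination: sort, skip, and take BEFORE materializing,
// so only one page of rows crosses the wire.
var page = db.Orders
    .OrderBy(o => o.Id)
    .Skip(pageNumber * pageSize)
    .Take(pageSize)
    .Select(o => new { o.Id, CustomerName = o.Customer.Name })
    .ToList();
```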
Steve Py
3

The performance you get in your application is influenced by many factors, but the cost of getting data out of a database can basically be split into three parts:

  • query run time
  • data transfer time
  • client processing time

This being said, if your highest cost is in the database itself or in the data transfer, then switching from EF to Dapper or to a plain data reader makes no sense. However, if your highest cost is in your client app (high concurrency, small payloads, lots of joins (LINQ), multiple result sets, many columns to map), then switching from EF to Dapper or to data readers could make sense. You need to understand, though, that you trade ease of use for performance. In time-critical apps this can make sense; in most real-world cases it might not.
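To isolate the client-side mapping cost, you can read the same rows with a raw SqlDataReader and no object mapping, as a sketch (the query is the one from the question; no ZipCodes instances are allocated):

```csharp
using (var connection = new SqlConnection(_connectionString))
using (var command = new SqlCommand(
    "SELECT TOP (43200) [Zip], [City], [State], [Latitude], [Longitude], [TimeZone], [DST] " +
    "FROM [dbo].[ZipCodes]", connection))
{
    connection.Open();
    using (var reader = command.ExecuteReader())
    {
        while (reader.Read())
        {
            // Touch the columns, but build no objects; the gap between this
            // and the Dapper/EF timings is the mapping overhead.
            string zip = reader.GetString(0);
        }
    }
}
```

If this runs close to the raw SSMS timing, the difference you measured is mostly materialization, not the database.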

Dumitrescu Bogdan