Best way to handle insert/update process for application with high transaction of data?

Question

I have been working on my single page application for the past few months and while trying to improve some things on the back end I started digging more into my process of Insert/Update and database in general. I use SQL Server 2008 and database is designed using Primary Keys and Foreign Keys as well as Indexes in order to make strong and reliable connection between the tables. I'm talking about these elements since our current system has a very poor designed database. We have a lot of problems with deadlocks, and slow performance. Our system is state wide product that has a large transaction of data especially in time periods like begging or end of the school year. However my new application will should expect the same or even bigger translations in the future so my database design and Insert/Update processes have to be improved and prevent the deadlocks. Question that I have is about SQL Server standards that should be used in this kind of systems. For example I recently started looking more into stored procedures and seems that a lot of DBA experts are recommending going that way in situations where same query is used over and over. That should prevent some security risks as well as improve the efficiency. Here is example on how I currently handle Insert and Update transaction in my system:

<cftransaction action="begin">
    <cftry>
        <cfquery name="qrySaveDictionary" datasource="#dsn#">
            DECLARE @Status BIT = <cfqueryparam cfsqltype="cf_sql_bit" value="#trim(arguments.frm_status)#">;
            DECLARE @Name VARCHAR(50) = <cfqueryparam cfsqltype="cf_sql_varchar" maxlength="50" value="#trim(arguments.frm_name)#">;
            DECLARE @Code CHAR(2) = UPPER(<cfqueryparam cfsqltype="cf_sql_char" maxlength="2" value="#trim(arguments.frm_code)#">);
            DECLARE @ActionDate DATETIME = CURRENT_TIMESTAMP;
            DECLARE @ActionID UNIQUEIDENTIFIER = <cfqueryparam cfsqltype="cf_sql_idstamp" value="#SESSION.AccountID#">;

            <cfif len(trim(arguments.frm_recordid))>
                DECLARE @RecordID INT = <cfqueryparam cfsqltype="cf_sql_integer" value="#trim(arguments.frm_recordid)#">;
                IF EXISTS(SELECT RecID FROM Dictionary WHERE RecID = @RecordID)
                BEGIN
                    UPDATE Dictionary
                    SET
                        Status = @Status,
                        Name = @Name,
                        Code = @Code,
                        ActionDt = @ActionDate,
                        ActionID = @ActionID
                    WHERE RecID = @RecordID
                    SELECT @RecordID AS RecID
                END
            <cfelse>
                BEGIN
                    IF NOT EXISTS(SELECT 1 FROM Dictionary WHERE Code = @Code)
                    INSERT INTO Dictionary (
                        Status,Name,Code,ActionDt,ActionID
                    ) VALUES (
                        @Status,@Name,@Code,@ActionDate,@ActionID
                    )
                    SELECT SCOPE_IDENTITY() AS RecID
                END
            </cfif>
        </cfquery>

        <cfset local.fnResults = {status : "200", message : "Record successully saved!", RecID : qrySaveDictionary.RecID}>

        <cfcatch type="any">
            <cftransaction action="rollback" />
                <cfset local.fnResults = {status : "400", class : "alert-danger", message : "Error! Please contact your administrator."}>
            </cfcatch>
        </cftry>
    </cftransaction>

In example above I use ColdFusion to check for existing record that is passed in the function argument. Also I use transaction in ColdFusion to prevent bad data in case of some error situation. Also I'm declaring all arguments inside of the query and then running Update or Insert statement. I read a lot of blog about IF EXIST and IF NOT EXIST and some people say that they are bad for performance and can cause deadlocks. My question is what is the best approach for this kind of statements:

1) Something like this:

begin tran
if exists (select * from table with (updlock,serializable) where key = @key)
begin
   update table set ...
   where key = @key
end
else
begin
   insert into table (key, ...)
   values (@key, ...)
end
commit tran

Or

2)

begin tran
   update table with (serializable) set ...
   where key = @key

   if @@rowcount = 0
   begin
      insert into table (key, ...) values (@key,..)
   end
commit tran

Or

3) Maybe use MERGE that I did not find a good example of. Also I'm wondering should this be a stored procedure function? Again I'm working by myself on this project, we do not have DBA in the company and my resources are limited. I'm trying to research on the internet and get the information about SQL Server and good practices. This project is heavy on data transactions as I mentioned and it's very important to me that I choose right approach. If anyone can provide some thoughts, information or examples that would be greatly appreciated.

What is the structure of the database? Could you explain why you need to check and update if data exist? — Adinugraha Tawaqal, Aug 13 '18 at 03:22
There's never a substitute for testing. I expect the difference between #1 and #2 to be small enough that it doesn't matter, if your database is plagued with other issues as well. I'd prefer #2 simply because it executes fewer statements while taking out the same amount of locks. `MERGE` should not be used in a trivial case like this. Rather than worry about this, though, you should probably look at actually investigating the deadlocks (with traces) and how to resolve them specifically. Just changing stuff in the hope they'll go away takes much longer. — Jeroen Mostert, Aug 13 '18 at 07:33
We tried 3 different approaches and timed each one: 1) Pass in IDs via 'WHERE column IN ()' clause. We found that CFQUERYPARAM only excepts a limited number of IDs, using 'list="yes"'. So we used a loop every N number of IDs. Or you could just get rid of the CFQUERYPARAM, if you take alternative precautions against SQL injection. 2) Use CFSTOREDPROC, which accepts multiple IDs. 3) Loop over a CFQUERY stored procedure using EXECUTE. Surprisingly, the third option was the fastest, because it seems like Coldfusion keeps the DB connection open, during the loop phase. — Charles Robertson, Aug 13 '18 at 09:19
The slowest method by far, is looping over a normal CFQUERY tag, which I would strongly advise against. — Charles Robertson, Aug 13 '18 at 09:19
@AdinugrahaTawaqal I'm not sure that I understand your question. If user submits the form I either have to insert the record or update existing record. All depends if `frm_recordid` exist. — espresso_coffee, Aug 13 '18 at 11:28
If user open existing data in form then you need to load data and system will know it will need to update. If user click new button then the system know it will insert the data. — Adinugraha Tawaqal, Aug 13 '18 at 11:31
@JeroenMostert I'm developing new system and there is no way to test these kind of things without load. I'm just trying to use tested/good ways for this kind of situations. I know what we currently use in our existing system and what kind of problems we have. I can tell you that we have 3 different databases and every time we need to pull record for a customer from database A where we store basic info like first last name etc. then I need to join that table in database B in order to pull billing info for example. In my opinion that is poor database stricture and everything should be in one DB. — espresso_coffee, Aug 13 '18 at 11:31
@AdinugrahaTawaqal I'm still confused how that is related to my question? Can you please explain. — espresso_coffee, Aug 13 '18 at 11:32
@CharlesRobertson I'm aware of some of the issues that you mentioned above. Limit does exist for `lists` and I believe is around 1000 records. Also I was fixing some code in existing system where they had `cfquery` inside of the `cfloop` and should be other way around like this `Insert or Update` that way database connection stays open until Insert/Update is completed. I never used stored procedure for this purpose and I read that might be the best option. Is there a chance that you can provide an example for this question? I would like to see SP solution if possible. — espresso_coffee, Aug 13 '18 at 11:36
Yes. That is another way, to put the loop inside the CFQUERY, but, as I said, we were surprised by the result of looping oustide the CFQUERY, but only with an EXECUTE SP. I will try and dig out the code. It was a few years ago... — Charles Robertson, Aug 13 '18 at 11:40
If your form knows of user are trying to insert new data vs updating existing data then you don't really need to check when submit. — Adinugraha Tawaqal, Aug 13 '18 at 11:41
When you presented the update form to user it needs to load existing data. This is different when you are trying to insert new data. — Adinugraha Tawaqal, Aug 13 '18 at 11:43
@AdinugrahaTawaqal If you want to update existing data you have to use unique ID for that. That is the only reliable factor for that purpose. On the back end I have to check if that unique id exist. I can't rely on the front end without validation on the back end. That would be security issue as well as I might end up with corrupted database. — espresso_coffee, Aug 13 '18 at 11:46
You simply need to create 2 different function for front end to call. This to function is to insert and to update just like SQL language itself that differentiates between insert and update. Function should not be ambiguous. — Adinugraha Tawaqal, Aug 13 '18 at 11:49
@Adinugraha Tawaqal Can you please explain who that would be different that the code that I currently use? As you can see I have one function but checking I'd then call update or insert? Also how that will help with deadlocks and improve efficiency? — espresso_coffee, Aug 13 '18 at 11:52
Your current function is ambiguous because it sometimes do insert and some other time do update. — Adinugraha Tawaqal, Aug 13 '18 at 12:04
see why you should not read while write : Reader-Writer Deadlocks https://www.red-gate.com/simple-talk/sql/performance/sql-server-deadlocks-by-example/ — Adinugraha Tawaqal, Aug 13 '18 at 12:05
Look up how to do a MERGE statement. This Exist / not exist is for too verbose. — James A Mohler, Aug 13 '18 at 15:13
@JamesAMohler So you are recommending MERGE for this type of transaction? I'm kind of confused since I see so many different opinions on how to handle this type of problem. — espresso_coffee, Aug 13 '18 at 15:18
Here is my take on it. MERGE lets SQL Server figure out the best way to do the operation. I trust it more than building if/else SQL statements. — James A Mohler, Aug 13 '18 at 18:23
@espresso_coffee - You mentioned deadlocks, but didn't say whether you'd identified, with absolute certainty, which statements were deadlocking - and what indexes exist on the table(s) involved. Rather than "guessing" what *might* fix the issue, you should trace the deadlocks to identify what is causing them first.... Only then can you determine how to resolve them. — SOS, Aug 13 '18 at 19:03
@Ageax I'm pretty sure that I elaborated in one of my previous answers. I'm creating new system, as of now I can't say if I will have deadlocks or now. I'm looking for good design and coding. My existing system has some of these problems but that is not the topic of this discussion. — espresso_coffee, Aug 13 '18 at 19:05
I don't recall that thread (or what indexes existed on the tables - which is important when it comes to db locking), which is why I asked. IMO the two things are related. There's no single method that'll eliminate deadlocks completely. Tracing deadlocks in the old system would provide valuable, real world details about what to avoid in the new system (query wise, lack of indexes, etc..), rather than hoping practice x or y will do the trick.. But that's up to you. Good luck. — SOS, Aug 13 '18 at 19:18
@Ageax I might be wrong and I'm not DBA expert but there is definitely a better way to approach this kind of transactions. That's why I asked this question. Thanks for the feedback. — espresso_coffee, Aug 13 '18 at 19:20
@espresso_coffee - While important, deadlocks are more involved than *only* how you structure the insert/update. Other statements, even ones without a TRANS, can still cause deadlocks. A legacy app I worked on had no transactions - of any kind - yet kept getting deadlocked on a simple SELECT. Turns out the SELECT was conflicting with a single record UPDATE due to poor execution plan. Adding a covering index resolved it, but every situation is different... This article gives a good explanation of some things that can happen... https://www.mssqltips.com/sqlserverauthor/272/viacheslav-maliutin/ — SOS, Aug 16 '18 at 17:34
@Ageax I have read the article from the link you shared. They talk about deadlocks and how to prevent them. One of the things that I found confusing was this part `Solve the SQL Server Deadlock with a Clustered Foreign Key`. From my understanding about SQL database design there can be only one clustered index in each table. I already have Primary Key with cluster id for example, how I can set another clustered index on my `FK` in that case? Is that correct design and approach? — espresso_coffee, Aug 16 '18 at 20:03

score 0 · Answer 1 · answered Aug 13 '18 at 12:15

You should not guess what you user would like to do. Your system should constrict/limit/validate user action with different button/form/function for each user action.

And also reading while writing introduce Reader-Writer Deadlocks https://www.red-gate.com/simple-talk/sql/performance/sql-server-deadlocks-by-example/.

All of this is why i suggest you need to split the qrySaveDictionary into 2 :

<cftransaction action="begin">
    <cftry>
        <cfquery name="qryInsertDictionary" datasource="#dsn#">
            DECLARE @Status BIT = <cfqueryparam cfsqltype="cf_sql_bit" value="#trim(arguments.frm_status)#">;
            DECLARE @Name VARCHAR(50) = <cfqueryparam cfsqltype="cf_sql_varchar" maxlength="50" value="#trim(arguments.frm_name)#">;
            DECLARE @Code CHAR(2) = UPPER(<cfqueryparam cfsqltype="cf_sql_char" maxlength="2" value="#trim(arguments.frm_code)#">);
            DECLARE @ActionDate DATETIME = CURRENT_TIMESTAMP;
            DECLARE @ActionID UNIQUEIDENTIFIER = <cfqueryparam cfsqltype="cf_sql_idstamp" value="#SESSION.AccountID#">;

                BEGIN
                     INSERT INTO Dictionary (
                        Status,Name,Code,ActionDt,ActionID
                    ) VALUES (
                        @Status,@Name,@Code,@ActionDate,@ActionID
                    )
                    SELECT SCOPE_IDENTITY() AS RecID
                END
        </cfquery>

        <cfset local.fnResults = {status : "200", message : "Record successully saved!", RecID : qrySaveDictionary.RecID}>

        <cfcatch type="any">
            <cftransaction action="rollback" />
                <cfset local.fnResults = {status : "400", class : "alert-danger", message : "Error! Please contact your administrator."}>
            </cfcatch>
        </cftry>
    </cftransaction>

and

<cftransaction action="begin">
    <cftry>
        <cfquery name="qryUpdateDictionary" datasource="#dsn#">
            DECLARE @Status BIT = <cfqueryparam cfsqltype="cf_sql_bit" value="#trim(arguments.frm_status)#">;
            DECLARE @Name VARCHAR(50) = <cfqueryparam cfsqltype="cf_sql_varchar" maxlength="50" value="#trim(arguments.frm_name)#">;
            DECLARE @Code CHAR(2) = UPPER(<cfqueryparam cfsqltype="cf_sql_char" maxlength="2" value="#trim(arguments.frm_code)#">);
            DECLARE @ActionDate DATETIME = CURRENT_TIMESTAMP;
            DECLARE @ActionID UNIQUEIDENTIFIER = <cfqueryparam cfsqltype="cf_sql_idstamp" value="#SESSION.AccountID#">;

                BEGIN
                    UPDATE Dictionary
                    SET
                        Status = @Status,
                        Name = @Name,
                        Code = @Code,
                        ActionDt = @ActionDate,
                        ActionID = @ActionID
                    WHERE RecID = @RecordID
                    SELECT @RecordID AS RecID
                END

        </cfquery>

        <cfset local.fnResults = {status : "200", message : "Record successully saved!", RecID : qrySaveDictionary.RecID}>

        <cfcatch type="any">
            <cftransaction action="rollback" />
                <cfset local.fnResults = {status : "400", class : "alert-danger", message : "Error! Please contact your administrator."}>
            </cfcatch>
        </cftry>
    </cftransaction>

And let your front end decide when to insert or update as the user choose their action.

If I split this in two functions you are saying that deadlocks or database blocks should be prevented? I'm reading the article so trying to understand the concept at the same time. Also should I just split these two statement in two separate queries in the same function or have separate function for each process? — espresso_coffee, Aug 13 '18 at 12:22
I try to erase the if exist statement to prevent the deadlock. You need to seperate function for seperate process (insert and update). You need to try it to see if this fixes the deadlock. — Adinugraha Tawaqal, Aug 13 '18 at 12:24
From your answer I see that you do not use `IF EXIST`. Are they something that is not helping in this situation or it's causing deadlocks? — espresso_coffee, Aug 13 '18 at 13:07
You select the same data with 'IF EXIST' then try to update it. This is Reader-Writer Deadlocks — Adinugraha Tawaqal, Aug 13 '18 at 13:18
Would be a good option to put this in stored procedure? Also if I do that should I still have two functions like `cffunction insert` and `cffunction update` and have stored procedures in those two? — espresso_coffee, Aug 13 '18 at 13:20
When I suggest to use two functions is not to split the function then combine them in another function that calls both based on condition. You simply need to know when to use insert or update function based on your UI. — Adinugraha Tawaqal, Aug 13 '18 at 13:23
Using query or Stored procedure is just the same if you use transaction, It has the same risk of deadlock. — Adinugraha Tawaqal, Aug 13 '18 at 13:24
Is there any benefit of using stored procedure for this kind of transaction? — espresso_coffee, Aug 13 '18 at 13:26
Hmm not that I know of. In regards to the possibility of deadlock. — Adinugraha Tawaqal, Aug 13 '18 at 13:32

Best way to handle insert/update process for application with high transaction of data?

1 Answers1