
I've hit a strange inconsistency problem with SQL Server inserts using a stored procedure. I'm calling a stored procedure from Python via pyodbc by running a loop to call it multiple times for inserting multiple rows in a table.

It seems to work normally most of the time, but after a while it just stops working in the middle of the loop. At that point, even if I call it just once from the code, it doesn't insert anything. I don't get any error messages in the Python console, and I actually get back the incremented identities for the table as though the data had been inserted, but when I go look, the data isn't there.

If I call the stored procedure from within SQL Server Management Studio and pass in data, it inserts the row and shows an identity number incremented as though the other records had been inserted, even though they are not in the database.

It seems I reach a certain limit on the number of times I can call the stored procedure from Python and it just stops working.

I'm making sure to disconnect after I finish looping through the inserts. Other stored procedures written in the same way and sent over the same database connection still work as usual.

I've tried restarting the computer with SQL Server and sometimes it will let me call the stored procedure from Python a few more times, but that eventually stops working as well.

I'm wondering if it has something to do with calling the stored procedure in a loop too quickly, but that doesn't explain why, after restarting the computer, inserts from the stored procedure still fail.

I've done lots of searching online, but haven't found anything quite like this.

Here is the stored procedure:

USE [Test_Results]
GO

SET ANSI_NULLS ON
GO
SET QUOTED_IDENTIFIER ON
GO

CREATE PROCEDURE [dbo].[insertStepData]
@TestCaseDataId int,
@StepNumber nchar(10),
@StepDateTime nvarchar(50)
AS
SET NOCOUNT ON;
BEGIN TRANSACTION
DECLARE @newStepId int


INSERT INTO TestStepData (
    TestCaseDataId,
    StepNumber,
    StepDateTime
     )
    VALUES (
    @TestCaseDataId,
    @StepNumber,
    @StepDateTime

)
SET @newStepId = SCOPE_IDENTITY();


SELECT @newStepId
FROM TestStepData

COMMIT TRANSACTION

Here is the method I use to call a stored procedure and get back the id number ('conn' is an active database connection via pyodbc):

def CallSqlServerStoredProc(self, conn, procName, *args):
    sql = """DECLARE @ret int
             EXEC @ret = %s %s
             SELECT @ret""" % (procName, ','.join(['?'] * len(args)))
    return int(conn.execute(sql, args).fetchone()[0])
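
To make the mechanics concrete, here is a standalone sketch of the T-SQL batch this helper builds for a three-argument call such as `insertStepData`; no database connection is needed to inspect it, and the argument values below are placeholders:

```python
# Standalone sketch of the batch that CallSqlServerStoredProc builds;
# argument values here are dummies, only the string construction matters.
procName = "insertStepData"
args = ("testCaseId", "stepNumber", "stepDateTime")

sql = """DECLARE @ret int
         EXEC @ret = %s %s
         SELECT @ret""" % (procName, ','.join(['?'] * len(args)))

print(sql)                            # the EXEC line reads: EXEC @ret = insertStepData ?,?,?
assert sql.count('?') == len(args)    # one '?' parameter marker per argument
```

pyodbc then binds each argument to a `?` marker when `conn.execute(sql, args)` runs.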

Here is where I'm passing in the stored procedure to insert:

    ....
    for testStep in testStepData:  

        testStepId = self.CallSqlServerStoredProc(conn, "insertStepData", testCaseId, testStep["testStepNumber"], testStep["testStepDateTime"])

        conn.commit()
        time.sleep(1)

    ....

1 Answer

SET @newStepId = SCOPE_IDENTITY();

SELECT @newStepId
FROM TestStepData

looks mighty suspicious to me:

  1. SCOPE_IDENTITY() returns numeric(38,0), which is larger than int. A conversion error may occur after some time. Update: now that we know the IDENTITY column is int, this is not an issue (SCOPE_IDENTITY() returns the last value inserted into that column in the current scope).

  2. SELECT into a variable doesn't guarantee its value if more than one record is returned. Besides, I don't get the idea behind overwriting the identity value we already have. In addition to that, the number of values returned by the last statement is equal to the number of rows in that table, which is increasing quickly - this is a likely cause of degradation. In brief, the last statement is not just useless, it's detrimental.

The 2nd statement also makes these statements misbehave:

EXEC @ret = %s %s
SELECT @ret

Since the procedure doesn't RETURN anything but SELECTs a single time, this chunk actually returns two data sets: 1) a single @newStepId value (from EXEC, yielded by the SELECT @newStepId <...>); 2) a single NULL (from SELECT @ret). fetchone() reads the 1st data set by default, so you don't notice this, but it doesn't help performance or correctness either.

Bottom line

Replace the 2nd statement with RETURN @newStepId.

Data not in the database problem

I believe it's caused by RETURN before COMMIT TRANSACTION. Make it the other way round.

In the original form, I believe it was caused by the long-working SELECT and/or possible side-effects from the SELECT not-to-a-variable being inside a transaction.
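
Putting both fixes together, the procedure body would look roughly like this (a sketch reusing the table and parameter names from the question; the final SELECT is replaced by RETURN, placed after the COMMIT):

```sql
CREATE PROCEDURE [dbo].[insertStepData]
    @TestCaseDataId int,
    @StepNumber nchar(10),
    @StepDateTime nvarchar(50)
AS
SET NOCOUNT ON;
BEGIN TRANSACTION

DECLARE @newStepId int

INSERT INTO TestStepData (TestCaseDataId, StepNumber, StepDateTime)
VALUES (@TestCaseDataId, @StepNumber, @StepDateTime)

SET @newStepId = SCOPE_IDENTITY();

COMMIT TRANSACTION    -- commit first, so the inserted row is durable...
RETURN @newStepId     -- ...then hand the new identity back as the int return value
```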

  • Thank you. I was using the select statement to get the inserted identity of each record. Is there a better way to do that? – courtenayt Jul 22 '14 at 00:57
  • My method to call the stored procedure and get back the resulting identity id was from http://stackoverflow.com/questions/9915788/how-to-get-a-sql-server-stored-procedure-return-value-using-pyodbc – courtenayt Jul 22 '14 at 00:59
  • Thank you for the ideas. I'm trying them out. I wanted to post what changes I made, but I'm not sure why I can't get any formatted code to be added into these comments. I've tried indenting and backticks, but no luck. – courtenayt Jul 22 '14 at 14:44
  • I wasn't sure if you were suggesting any other changes to the CallSqlServerStoredProc method other than changing `DECLARE @ret int` to `DECLARE @ret numeric(38,0)`. – courtenayt Jul 22 '14 at 14:48
  • I also modified the stored proc to change `DECLARE @newStepId int` to `DECLARE @newStepId numeric(38,0)`, and removed the select statement at the bottom and replaced it with `RETURN @newStepId`. I re-ran the Python code and it seems to have the same issue as before: I get the IDs back, but the data is not in the database. I'm not sure if it is stuck in a bad state, and I'm hesitant to truncate the table and lose all of my data. – courtenayt Jul 22 '14 at 14:56
  • You can only `RETURN` an `int` from a `PROCEDURE` (rather than `FUNCTION`), so your current type changes do no good. Make it like I described. – ivan_pozdeev Jul 22 '14 at 16:48
  • Thank you for the additional information. I'm afraid I'm not totally understanding how my stored procedure should be changed. Would you be willing to post the changes you describe above? – courtenayt Jul 22 '14 at 17:02
  • I've explained it in an update. Anyway, the type changes to `numeric` only fix an eventual `int` overflow. Let's first get them back to `int` and check that the rapid degradation problem is solved. – ivan_pozdeev Jul 23 '14 at 09:18
  • Btw, what is the type of the `IDENTITY` column in the `TestStepData` table? You never published its definition. If it's `int`, there's no need to change types to `numeric(38,0)`! – ivan_pozdeev Jul 23 '14 at 10:00
  • `IDENTITY` is an int in the table definition. Are you saying that I should then leave both `@ret` and `@newStepId` as `int` then? – courtenayt Jul 23 '14 at 15:01
  • I currently have both `@ret` and `@newStepId` as `numeric(38,0)`, and I have `COMMIT TRANSACTION` before the `RETURN @newStepId`, and I've gotten over 6000 records in the table without any missing rows. Previously it was only getting to about 750 rows when I would first see missing rows in the table. From what I can tell, it seems to be fixed, but I wanted to check if you still would like me to test just using `int` for both variables as I mentioned in my previous comment. – courtenayt Jul 23 '14 at 15:06
  • I updated my original post to show my latest changes at the bottom of the post. Please let me know if you see any issues. – courtenayt Jul 23 '14 at 15:14
  • Yes, there's no need to convert variables to `numeric` and procedure to function. I've cleaned out the now-irrelevant parts from the answer and see no other issues. – ivan_pozdeev Jul 23 '14 at 17:03
  • Thank you so much for your help. It's been running reliably so far! – courtenayt Jul 23 '14 at 19:33