SET UP
MWE: I have a table in SQL Server as such
CREATE TABLE dbo.MyTable(
order_id INT IDENTITY(1,1),
column2 DECIMAL,
column3 INT
PRIMARY KEY(order_id)
)
I am using pyodbc to insert some data in the form of a pandas.DataFrame into the table. I am using data such as:
column2 column3
0 1.23 5
1 4.95 9
2 6.79 10
Where I've created this example dataframe using
data = pd.DataFrame({'column2':[1.23, 4.95, 6.79], 'column3':[5,9,10]})
I use the following statement to insert data
stmt = "INSERT INTO dbo.MyTable(column2, column3) OUTPUT Inserted.order_id VALUES (?, ?)"
ISSUE
This is the code that I use to insert everything and returning the values:
# Set up connection and create cursor
conn_string = "DRIVER={MyDriver};SERVER=MyServer;DATABASE=MyDb;UID=MyUID;PWD=MyPWD"
cnxn = pyodbc.connect(conn_string)
cnxn.autocommit = False
cursor = cnxn.cursor()
cursor.fast_executemany = True
# Upload data
cursor.executemany(stmt, data.values.tolist())
# Process the result
try:
first_result = cursor.fetchall()
except pyodbc.ProgrammingError:
first_result = None
result_sets = []
while cursor.nextset():
result_sets.append(cursor.fetchall())
all_inserted_ids = np.array(result_sets).flatten()
However, I do not get all the ids that I should get! For instance, suppose there is no data in the table, I will not get
all_inserted_ids = np.array([1, 2, 3])
But rather I will only get
all_inserted_ids = np.array([2, 3])
Which means that I'm losing the first id somewhere!
And notice that first_result
never works. It always throws the following:
pyodbc.ProgrammingError: No results. Previous SQL was not a query.
I've also tried using cursor.fetchone()
, cursor.fetchone()[0]
or cursor.fetchval()
but they got me the same error.
METHODS THAT I TRIED BUT DID NOT WORK
1) Adding "SET NOCOUNT ON"
I tried using the same code as in the question but with
stmt =
"""
SET NOCOUNT ON;
INSERT INTO dbo.MyTable(column2, column3)
OUTPUT Inserted.order_id
VALUES (?, ?)
"""
The output was [1, 2]
so I was missing 3
.
2) Adding "SET NOCOUNT ON" and inserting output to table variable
I used the following statement:
stmt =
"""
SET NOCOUNT ON;
DECLARE @NEWID TABLE(ID INT);
INSERT INTO dbo.MyTable(column2, column3)
OUTPUT Inserted.order_id INTO @NEWID(ID)
VALUES (?, ?)
SELECT ID FROM @NEWID
"""
Again this didn't work as I obtained only '[2, 3]' but no '1'.
3) Selecting the @@IDENTITY
I used the following statement:
stmt =
"""
INSERT INTO dbo.MyTable(column2, column3)
OUTPUT Inserted.order_id
VALUES (?, ?)
SELECT @@IDENTITY
"""
But it didn't work as I obtained array([Decimal('1'), 2, Decimal('2'), 3, Decimal('3')]
4) Selecting @@IDENTITY with SET NOCOUNT ON
I used
stmt =
"""
SET NOCOUNT ON
INSERT INTO dbo.MyTable(column2, column3)
OUTPUT Inserted.order_id
VALUES (?, ?);
SELECT @@IDENTITY
"""
but I got array([Decimal('1'), 2, Decimal('2'), 3, Decimal('3')], dtype=object)
again.
5) Selecting @@IDENTITY without using OUTPUT
I used:
stmt =
"""
INSERT INTO dbo.MyTable(column2, column3)
VALUES (?, ?);
SELECT @@IDENTITY
"""
But I got [Decimal('2') Decimal('3')]
6) Selecting @@IDENTITY without using OUTPUT but with SET NOCOUNT ON
I used:
stmt =
"""
SET NOCOUNT ON
INSERT INTO dbo.MyTable(column2, column3)
VALUES (?, ?);
SELECT @@IDENTITY
"""
But again I got: [Decimal('2') Decimal('3')]
A possible way around this, which is really bad, but does the job
A possible way is to create a new table where we'll store the ids and truncate it once we're done. It is horrible but I couldn't find any other solution..
Create a table:
CREATE TABLE NEWID(
ID INT
PRIMARY KEY (ID)
)
Next this is the complete code:
import pyodbc
import pandas as pd
import numpy as np
# Connect
conn_string = """
DRIVER={MYDRIVER};
SERVER=MYSERVER;
DATABASE=DB;
UID=USER;
PWD=PWD
"""
cnxn = pyodbc.connect(conn_string)
cnxn.autocommit = False
cursor = cnxn.cursor()
cursor.fast_executemany = True
# Data, Statement, Execution
data = pd.DataFrame({'column2': [1.23, 4.95, 6.79], 'column3': [5, 9, 10]})
stmt = """
INSERT INTO dbo.MyTable(column2, column3)
OUTPUT Inserted.order_id INTO NEWID(ID)
VALUES (?, ?);
"""
cursor.executemany(stmt, data.values.tolist())
cursor.execute("SELECT ID FROM NEWID;")
# Get stuff
try:
first_result = cursor.fetchall()
except pyodbc.ProgrammingError:
first_result = None
result_sets = []
while cursor.nextset():
result_sets.append(cursor.fetchall())
all_inserted_ids = np.array(result_sets).flatten()
print('First result: ', first_result)
print('All IDs: ', all_inserted_ids)
cursor.commit()
# Remember to truncate the table for next use
cursor.execute("TRUNCATE TABLE dbo.NEWID;", [])
cursor.commit()
This will return
First result: [(1, ), (2, ), (3, )]
All IDs: []
So we just keep the first result.