I am using the oracledb
Python package to query a table in an Oracle database. When calling cursor.fetchall()
(or cursor.fetchmany()
in batches), the cursor.rowcount
attribute is higher than the actual number of rows returned. However, this issue does not appear when using the same methods in cx_Oracle
.
For example, when selecting all records from a table with 93 rows, cursor.rowcount
is 96, whereas I would expect it to be 93 to match the number of rows in the table since the docs describe it like this:
This read-only attribute specifies the number of rows that have currently been fetched from the cursor (for select statements)...
Versions:
Python: 3.8.10
oracledb: 1.2.2
cx_Oracle: 8.3.0
Oracle database: 19c (19.12.0.0.0)
Full example
Create example table:
-- Create a table of ids from 1 to 93
create table example_table as
select
rownum as id
from
dual
connect by rownum <= 93
Using both cursor.fetchmany()
in batches and cursor.fetchall()
:
import oracledb
SQL = "select id from example_table"
un = 'abc'
pw = 'def'
cs = 'ghi'
with oracledb.connect(user=un, password=pw, dsn=cs) as connection:
with connection.cursor() as cursor:
cursor.prefetchrows = 0
cursor.arraysize = 10
cursor.execute(SQL)
print(f"After execute() call: {cursor.rowcount=}")
batch = 0
while True:
rows = cursor.fetchmany()
if not rows:
break
print(f"Batch {batch}: {cursor.rowcount=}")
batch += 1
rows = cursor.fetchall()
print(f"After fetchall() call: {cursor.rowcount=}")
Produces this:
After execute() call: cursor.rowcount=0
Batch 0: cursor.rowcount=20
Batch 1: cursor.rowcount=30
Batch 2: cursor.rowcount=40
Batch 3: cursor.rowcount=50
Batch 4: cursor.rowcount=60
Batch 5: cursor.rowcount=70
Batch 6: cursor.rowcount=80
Batch 7: cursor.rowcount=90
Batch 8: cursor.rowcount=100
Batch 9: cursor.rowcount=96
After fetchall() call: cursor.rowcount=96
Note that the final values are the same if just calling fetchall()
(i.e. removing the fetchmany()
calls):
After execute() call: cursor.rowcount=0
After fetchall() call: cursor.rowcount=96
The key point is that cursor.rowcount
is 96, where I'd expect it to be 93. The interesting parts to me are:
- After the first batch,
rowcount
is already 20 - I would have expected it to be 10 here - After batch 8,
rowcount
is 100 - it has 'overshot' the actual table row count - which seems to indicaterowcount
does not necessarily reflect the number of rows actually fetched, but the number of rows attempted to be fetched, at least initially - After the final batch (and also after the
fetchall()
call),rowcount
is 96 - I would expect it to be 93 here- From testing with other values for
arraysize
, it seems to follow the patternr = t + (t mod a)
wherer = rowcount
,t = total table rows
, anda = arraysize
- To illustrate, another example on the same table using
arraysize = 20
:
After execute() call: cursor.rowcount=0 After fetchall() call: cursor.rowcount=106
- From testing with other values for
cx_Oracle
comparison
Now testing using the same methods in cx_Oracle
- the only lines changed being:
cx_Oracle.init_oracle_client(lib_dir=r"C:\oracle\instantclient_21_7")
with cx_Oracle.connect(user=un, password=pw, dsn=cs) as connection:
This behaves how I expected:
After execute() call: cursor.rowcount=0
Batch 0: cursor.rowcount=10
Batch 1: cursor.rowcount=20
Batch 2: cursor.rowcount=30
Batch 3: cursor.rowcount=40
Batch 4: cursor.rowcount=50
Batch 5: cursor.rowcount=60
Batch 6: cursor.rowcount=70
Batch 7: cursor.rowcount=80
Batch 8: cursor.rowcount=90
Batch 9: cursor.rowcount=93
After fetchall() call: cursor.rowcount=93
Summary
I originally wondered if the issue might be caused by cursor.prefetchrows
, which is why I set that to 0, but that had no effect.
I wonder if it is related to how these methods are using arraysize
to make reads to the database.
I'm still not sure what I'm missing here though. What is causing this behaviour, and how can I resolve it?