I want to read data from a table in my Oracle database and load it into a pandas DataFrame in Python. The table has 22 million records, and calling fetchall() takes a very long time without returning any result (the query itself completes in Oracle in about 1 second).
I have tried slicing the data with the code below, but it is still not efficient.
import cx_Oracle
import pandas as pd

connect_serv = cx_Oracle.connect(user='', password='', dsn='')
cur = connect_serv.cursor()

table_row_count = 22242387
batch_size = 100000

# Paginate with ROW_NUMBER() so each execute() returns one slice of the table
sql = """select t.*
         from (select a.*, ROW_NUMBER() OVER (ORDER BY column1) as row_num
               from table1 a) t
         where t.row_num between :LOWER_BOUND and :UPPER_BOUND"""

data = []
# ROW_NUMBER() starts at 1, so the slices are 1..100000, 100001..200000, ...
for lower_bound in range(1, table_row_count, batch_size):
    cur.execute(sql, {'LOWER_BOUND': lower_bound,
                      'UPPER_BOUND': lower_bound + batch_size - 1})
    for row in cur.fetchall():
        data.append(row)
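From what I have read, the real bottleneck may be the number of network round trips rather than the query itself: cx_Oracle only fetches cursor.arraysize rows per round trip, and the default is 100, which would mean roughly 220,000 round trips for 22 million rows. Is something along these lines the right direction? This is only a sketch under that assumption; "select * from table1" stands in for my real query, and I am assuming cursor.prefetchrows is available (cx_Oracle 7.3+).

import cx_Oracle
import pandas as pd

connect_serv = cx_Oracle.connect(user='', password='', dsn='')
cur = connect_serv.cursor()

# Larger arraysize means fewer network round trips (default is 100)
cur.arraysize = 10000
# Assumption: prefetchrows = arraysize + 1 is the commonly suggested tuning
cur.prefetchrows = 10001

cur.execute("select * from table1")  # placeholder for my real query
columns = [d[0] for d in cur.description]

chunks = []
while True:
    rows = cur.fetchmany(cur.arraysize)  # one batch per call
    if not rows:
        break
    chunks.append(pd.DataFrame(rows, columns=columns))

df = pd.concat(chunks, ignore_index=True)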
I would like to know the proper way to fetch this amount of data into Python in a reasonable time.
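I have also seen pandas.read_sql with a chunksize suggested for this kind of problem. Would a sketch like the one below be more appropriate? I am assuming here that passing a raw cx_Oracle connection works, although pandas officially recommends a SQLAlchemy engine.

import cx_Oracle
import pandas as pd

connect_serv = cx_Oracle.connect(user='', password='', dsn='')

# Stream the result set in 100k-row chunks instead of materializing
# all 22M rows in one fetchall(); each chunk arrives as its own DataFrame
chunk_iter = pd.read_sql("select * from table1", connect_serv, chunksize=100000)
df = pd.concat(chunk_iter, ignore_index=True)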