I had the same problem with in a slightly different context. I needed to query a +27000 rows table and it turns out that cx_Oracle cuts the connection to the DB after a while.
While a connection to the db is open, you can use the read() method of the cx_Oracle.Lob object to transform it into a string. But if the query brings a table that is too big, it won´t work because the connection will stop at some point and when you want to read the results from the query you´ll gt an error on the cx_Oracle objects.
I tried many things, like setting
connection.callTimeout = 0 (according to documentation, this means it would wait indefinetly), using fetchall() and then putting the results on a dataframe or numpy array but I could never read the cx_Oracle.Lob objects.
If I try to run the query using pandas.DataFrame.read_sql(query, connection) The dataframe would contain cx_Oracle.Lob objects with the connection closed, making them useless. (Again this only happens if the table is very big)
In the end I found a way of getting around this by querying and creating a csv file inmediatlely after, even though I know it´s not ideal.
def csv_from_sql(sql: str, path: str="dataframe.csv") -> bool:
try:
with cx_Oracle.connect(config.username, config.password, config.database, encoding=config.encoding) as connection:
connection.callTimeout = 0
data = pd.read_sql(sql, con=connection)
data.to_csv(path)
print("FILE CREATED")
except cx_Oracle.Error as error:
print(error)
return False
finally:
print("PROCESS ENDED\n")
return True
def make_query(sql: str, path: str="dataframe.csv") -> pd.DataFrame:
if csv_from_sql(sql, path):
dataframe = pd.read_csv("dataframe.csv")
return dataframe
return pd.DataFrame()
This took a long time (about 4 to 5 minutes) to bring my +27000-rows table, but it worked when everything else didn´t.
If anyone knows a better way, it would be helpful for me too.