Pandas gets ridiculously slow when loading records from a Teradata server using teradatasql, mainly through the function pandas.read_sql(query, teradata_con). It takes 40-45 minutes to load 1-1.5 million records from a Teradata table.

import pandas as pd

sql_query = "select * from DB.TableName where columnname = 'values'"
df = pd.read_sql(sql_query, con_t)

I used the chunksize option as well, but it doesn't reduce the execution time; it only loads the data in chunks, taking the same total time. I also tried to look into the IOPro package, but didn't find much information on it. Is there any way to reduce the execution time? When I execute the same SQL query directly in the management tool, it takes a third of the time compared to pandas.
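To illustrate what chunksize does (and doesn't do): it turns read_sql into an iterator of DataFrames, which bounds memory use, but every row still pays the same driver-fetch and conversion cost, so total wall time is roughly unchanged. A minimal runnable sketch using an in-memory sqlite3 database as a stand-in for the Teradata connection (with teradatasql, `con` would instead come from `teradatasql.connect(...)`):

```python
import sqlite3
import pandas as pd

# Stand-in database so the pattern is runnable without a Teradata server.
con = sqlite3.connect(":memory:")
con.execute("create table t (id integer, val text)")
con.executemany("insert into t values (?, ?)", [(i, "x") for i in range(1000)])

# chunksize=N yields DataFrames of up to N rows instead of one big frame.
chunks = pd.read_sql("select * from t", con, chunksize=250)
df = pd.concat(chunks, ignore_index=True)
print(len(df))  # 1000 rows total, same data as a single read_sql call
```

The iterator only helps when you can process each chunk and discard it; concatenating all chunks back together, as above, gives the same result and roughly the same runtime as the one-shot call.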

SKP

0 Answers