Pandas gets ridiculously slow when loading records from a Teradata server using teradatasql, mainly through the function pandas.read_sql(query, teradata_con). It takes 40-45 minutes to load 1-1.5 million records from a Teradata table.

import pandas as pd

sql_query = "select * from DB.TableName where columnname = 'values'"
df = pd.read_sql(sql_query, con_t)

I used the chunksize option as well, but it doesn't reduce the execution time; it only loads the data in chunks, taking the same total time. I also tried to look into the IOPro package, but didn't find much information on it. Is there any way to reduce the execution time? When I execute the same SQL query directly in the management tool, it takes a third of the time compared to pandas.
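To illustrate what chunksize does (and doesn't do): it turns read_sql into an iterator of DataFrames, which bounds memory use, but every row still pays the same driver-fetch and conversion cost, so total wall time is roughly unchanged. A minimal runnable sketch using an in-memory sqlite3 database as a stand-in for the Teradata connection (with teradatasql, `con` would instead come from `teradatasql.connect(...)`):

```python
import sqlite3
import pandas as pd

# Stand-in database so the pattern is runnable without a Teradata server.
con = sqlite3.connect(":memory:")
con.execute("create table t (id integer, val text)")
con.executemany("insert into t values (?, ?)", [(i, "x") for i in range(1000)])

# chunksize=N yields DataFrames of up to N rows instead of one big frame.
chunks = pd.read_sql("select * from t", con, chunksize=250)
df = pd.concat(chunks, ignore_index=True)
print(len(df))  # 1000 rows total, same data as a single read_sql call
```

The iterator only helps when you can process each chunk and discard it; concatenating all chunks back together, as above, gives the same result and roughly the same runtime as the one-shot call.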

SKP

0 Answers