
I have an application that does intensive MySQL reads and usually needs to fetch a lot of data at once (e.g. ~100K rows per query), but I found it is very slow, especially when reading into pandas. Here is what I tried:

option-1: pure SQL + Sequel = ~38ms

Selecting all 100K rows with pure SQL

    SELECT * FROM thetable

in a DB workbench (Sequel) takes about 38ms.

option-2: MySQLdb connector + fetchall() = ~1.6s
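
For reference, option-2 is measured with roughly the snippet below (the connection parameters are placeholders, not the real credentials):

    import time
    import MySQLdb  # provided by the mysqlclient package

    # placeholder connection parameters
    conn = MySQLdb.connect(host="localhost", user="user",
                           passwd="password", db="thedb")

    start = time.time()
    cur = conn.cursor()
    cur.execute("SELECT * FROM thetable")
    rows = cur.fetchall()                  # pulls all ~100K rows at once
    print(len(rows), time.time() - start)  # ~1.6s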

option-3: MySQLdb connector + pandas.read_sql(sql, conn) = ~1.7s
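
Option-3 hands the same MySQLdb connection to pandas, roughly like this (again with placeholder connection parameters):

    import time
    import MySQLdb
    import pandas as pd

    # placeholder connection parameters
    conn = MySQLdb.connect(host="localhost", user="user",
                           passwd="password", db="thedb")

    start = time.time()
    df = pd.read_sql("SELECT * FROM thetable", conn)
    print(df.shape, time.time() - start)  # ~1.7s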

option-4: pymysql connector + pandas.read_sql(sql, conn) = ~4.3s
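
Option-4 only swaps the driver for pymysql, roughly:

    import time
    import pymysql
    import pandas as pd

    # placeholder connection parameters
    conn = pymysql.connect(host="localhost", user="user",
                           password="password", database="thedb")

    start = time.time()
    df = pd.read_sql("SELECT * FROM thetable", conn)
    print(df.shape, time.time() - start)  # ~4.3s
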
It seems that using MySQLdb instead of pymysql gives about a 3x speedup, but how can I get close to the performance of the pure SQL query?

