
I have an application that does intensive MySQL reads and usually needs to fetch a lot of data at once (e.g. ~100K rows per query), but I found it is very slow, especially when reading into pandas. Here is what I tried:

option-1: pure SQL + Sequel = ~38ms

Selecting all 100K rows with pure SQL

    SELECT * FROM thetable

in a DB workbench (Sequel) takes about 38ms.

option-2: MySQLdb connector + fetchall() = ~1.6s
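
For reference, option-2 is measured with roughly the snippet below (the connection parameters are placeholders, not the real credentials):

    import time
    import MySQLdb  # provided by the mysqlclient package

    # placeholder connection parameters
    conn = MySQLdb.connect(host="localhost", user="user",
                           passwd="password", db="thedb")

    start = time.time()
    cur = conn.cursor()
    cur.execute("SELECT * FROM thetable")
    rows = cur.fetchall()                  # pulls all ~100K rows at once
    print(len(rows), time.time() - start)  # ~1.6s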

option-3: MySQLdb connector + pandas.read_sql(sql, conn) = ~1.7s
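
Option-3 hands the same MySQLdb connection to pandas, roughly like this (again with placeholder connection parameters):

    import time
    import MySQLdb
    import pandas as pd

    # placeholder connection parameters
    conn = MySQLdb.connect(host="localhost", user="user",
                           passwd="password", db="thedb")

    start = time.time()
    df = pd.read_sql("SELECT * FROM thetable", conn)
    print(df.shape, time.time() - start)  # ~1.7s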

option-4: pymysql connector + pandas.read_sql(sql, conn) = ~4.3s
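
Option-4 only swaps the driver for pymysql, roughly:

    import time
    import pymysql
    import pandas as pd

    # placeholder connection parameters
    conn = pymysql.connect(host="localhost", user="user",
                           password="password", database="thedb")

    start = time.time()
    df = pd.read_sql("SELECT * FROM thetable", conn)
    print(df.shape, time.time() - start)  # ~4.3s
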
It seems that using MySQLdb instead of pymysql gives about a 3x speedup, but how can I get close to the performance of the pure SQL query?

