
There are many ways to export MySQL tables to CSV using Python, but I want to know the best way of doing it. Currently I am exporting around 50 tables to CSV, which takes around 47 minutes and uses more than 16 GB of memory.

The code is:

import os
import urllib.parse

import pandas as pd
from sqlalchemy import create_engine
from tqdm import tqdm

sqlEngine = create_engine(f'mysql+pymysql://{MYSQL_READER_USERNAME}:%s@{MYSQL_READER_HOST}/{MYSQL_DB_NAME}' % urllib.parse.quote(MYSQL_READER_PASSWORD), pool_recycle=3600)

def export_table(name, download_location):
    # reads the entire table into a DataFrame before writing the CSV
    table = pd.read_sql(f'select /*+ MAX_EXECUTION_TIME(100000000) */ * from {name}', sqlEngine)
    table.to_csv(os.path.join(download_location, name + '.csv'), index=False)

tables = ['table1', ... , 'table50']

for table in tqdm(tables):
    print(f'\t => \t Storing {table}')
    export_table(table, store_dir)
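
For reference, pandas can also read the result in chunks instead of materialising the whole table at once; a rough sketch of that variant, reusing sqlEngine and the imports from the snippet above (the chunk size of 100,000 is just a placeholder, and stream_results=True is the SQLAlchemy execution option that requests a server-side cursor so the driver does not buffer the full result):

def export_table_chunked(name, download_location, chunksize=100_000):
    out_path = os.path.join(download_location, name + '.csv')
    # stream_results asks the driver for a server-side cursor, so rows are
    # fetched as they are consumed instead of being buffered all at once
    with sqlEngine.connect().execution_options(stream_results=True) as connection:
        chunks = pd.read_sql(f'select * from {name}', connection, chunksize=chunksize)
        for i, chunk in enumerate(chunks):
            # write the header only for the first chunk, then append
            chunk.to_csv(out_path, mode='w' if i == 0 else 'a', header=(i == 0), index=False)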

I have seen many methods for storing to CSV, such as:

  • using a cursor (a rough sketch of this follows below)
  • using the pyodbc library
  • the pandas read_sql method

Is there any other method or technique, and which one is best for reducing memory usage or execution time?
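
The cursor-based route could look roughly like this; a minimal sketch assuming pymysql's unbuffered SSCursor, the standard-library csv module, and the same connection constants as in the snippet above (export_table_cursor is just an illustrative name):

import csv
import os
import pymysql
import pymysql.cursors

def export_table_cursor(name, download_location):
    connection = pymysql.connect(
        host=MYSQL_READER_HOST,
        user=MYSQL_READER_USERNAME,
        password=MYSQL_READER_PASSWORD,
        database=MYSQL_DB_NAME,
        cursorclass=pymysql.cursors.SSCursor,  # unbuffered, server-side cursor
    )
    try:
        out_path = os.path.join(download_location, name + '.csv')
        with connection.cursor() as cursor, open(out_path, 'w', newline='') as f:
            cursor.execute(f'select * from {name}')
            writer = csv.writer(f)
            writer.writerow(col[0] for col in cursor.description)  # header row
            for row in cursor:  # rows stream from the server as they are consumed
                writer.writerow(row)
    finally:
        connection.close()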

Asad Khalil
  • Definitely don't download it into a Pandas dataframe just to create a CSV; it's a huge waste of time and memory. You can create a CSV as part of your query, just add e.g. `INTO OUTFILE "whatever.csv" FIELDS TERMINATED BY ',' ENCLOSED BY '"' LINES TERMINATED BY '\n';` – Josh Friedlander Dec 13 '21 at 13:00
  • @Josh Friedlander can you elaborate on how I can achieve this? – Asad Khalil Dec 14 '21 at 05:28
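
For illustration, the comment's suggestion could be issued through the existing sqlEngine roughly as follows; note that INTO OUTFILE is written by the MySQL server on its own filesystem, and it requires the FILE privilege plus a location allowed by secure_file_priv, so the server-side directory below is only a placeholder:

from sqlalchemy import text

def export_table_outfile(name, server_side_dir):
    # The CSV is created by the MySQL server on the database host,
    # not on the machine running this script.
    query = text(
        f"SELECT * FROM {name} "
        f"INTO OUTFILE '{server_side_dir}/{name}.csv' "
        "FIELDS TERMINATED BY ',' ENCLOSED BY '\"' "
        "LINES TERMINATED BY '\\n'"
    )
    with sqlEngine.connect() as connection:
        connection.execute(query)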

0 Answers