0

I have a question, I hope you can help.

I have 2 MySQL servers. I have a Python job which extracts certain data from one server and dumps it into the other, periodically.

This is super slow; it's taking a lifetime to make the copy.

Can anyone help me to make it faster?

import mysql.connector
import pandas as pd
from sqlalchemy import create_engine
import multiprocessing as mp
from multiprocessing import cpu_count

try:
    engine_source = create_engine("connection string")

    iconn = engine_source.connect()

    data = pd.read_sql('SELECT QUERY ', iconn)

    data['load_id'] = '2'

    print('EXTRACT COMPLETE')

    engine_dest = create_engine("connection string 2")

    iconndest = engine_dest.connect()
    print(iconndest)
    
    data.to_sql(schema ='dbname', name='tablename', con=iconndest, if_exists = 'append', chunksize=1000, index= False)

    engine_source.dispose()
    engine_dest.dispose()

except Exception as e:
    print('extract: ' + str(e))
kikee1222
  • 1,866
  • 2
  • 23
  • 46
  • Does this answer your question? [How can I transfer data between 2 MySQL databases?](https://stackoverflow.com/questions/3242504/how-can-i-transfer-data-between-2-mysql-databases) – Zev Averbach Jul 07 '22 at 09:03
  • Are you constrained by RAM? I'm no expert on *pandas* but it may be that the entire resultset is being loaded into memory – DarkKnight Jul 07 '22 at 09:04
  • Thanks @zev I don't think so - they're entirely different servers In terms of RAM, I think it should be OK - we have a small result set (<1GB) and 16GB RAM and this is the only job it's used for# – kikee1222 Jul 07 '22 at 09:05
  • @kikee1222 The RAM in the servers is irrelevant. It's the RAM where the Python program is running that's important. Have you tried not using *pandas* ? – DarkKnight Jul 07 '22 at 09:09
  • Check does the servers (or at least one of them) supports [FEDERATED storage engine](https://dev.mysql.com/doc/refman/8.0/en/federated-storage-engine.html). If true then transfer the data immediately from one server to another, your program will only manage this process. – Akina Jul 07 '22 at 09:11

0 Answers0