The Problem
I would like to use the pandas to_sql to write a dataframe to a MYSQL table. However, my connection requires SSH.
What I have tried
I have a successful connection to execute queries with pymysql, but being able to directly use a function like to_sql would make my life a lot easier to be able to directly push the data like that. See below for my code that I'm working with.
from sshtunnel import SSHTunnelForwarder
import pymysql as db
import pandas as pd
import numpy as np
host = 'host'
localhost = 'localhost'
ssh_username = 'ssh_username'
private_key = '/path/'
# database variables
user='user'
password='password'
database='database'
#query function that works for pulling from database
def query(q):
with SSHTunnelForwarder(
(host, 22),
ssh_username=ssh_username,
ssh_private_key=private_key,
ssh_private_key_password="password",
remote_bind_address=(localhost, port)
) as server:
conn = db.connect(host=localhost,
port=server.local_bind_port,
user=user,
passwd=password,
db=database)
return pd.read_sql_query(q, conn)
# What you need to for to_sql
conn = db.connect(host=host,
port=port,
user=user,
password=password,
db=database)
# test df
np.random.seed(0)
number_of_samples = 10
frame = pd.DataFrame({
'feature1': np.random.random(number_of_samples),
'feature2': np.random.random(number_of_samples),
'class': np.random.binomial(2, 0.1, size=number_of_samples),
},columns=['feature1','feature2','class'])
# to_sql
frame.to_sql(con=conn, name='test_table', if_exists='replace', flavor='mysql')
Maybe Something Else?
I'm considering looking into turning a dataframe into a CSV file and then importing it into the database. Please let me know if you have any clue how to use something like to_sql with SSH.