4

The Problem

I would like to use the pandas to_sql to write a dataframe to a MYSQL table. However, my connection requires SSH.

What I have tried

I have a successful connection to execute queries with pymysql, but being able to directly use a function like to_sql would make my life a lot easier to be able to directly push the data like that. See below for my code that I'm working with.

from sshtunnel import SSHTunnelForwarder
import pymysql as db
import pandas as pd
import numpy as np

host = 'host'
localhost = 'localhost'
ssh_username = 'ssh_username'
private_key = '/path/'

# database variables
user='user'
password='password'
database='database'

#query function that works for pulling from database
def query(q):
    with SSHTunnelForwarder(
        (host, 22),
        ssh_username=ssh_username,
        ssh_private_key=private_key,
        ssh_private_key_password="password",
        remote_bind_address=(localhost, port)
    ) as server:
        conn = db.connect(host=localhost,
                               port=server.local_bind_port,
                               user=user,
                               passwd=password,
                               db=database)

        return pd.read_sql_query(q, conn)

# What you need to for to_sql
 conn = db.connect(host=host,
                        port=port,
                        user=user, 
                        password=password,  
                        db=database)

# test df
np.random.seed(0)
number_of_samples = 10
frame = pd.DataFrame({
'feature1': np.random.random(number_of_samples),
'feature2': np.random.random(number_of_samples),
'class':    np.random.binomial(2, 0.1, size=number_of_samples),
},columns=['feature1','feature2','class'])

# to_sql
frame.to_sql(con=conn, name='test_table', if_exists='replace', flavor='mysql')

Maybe Something Else?

I'm considering looking into turning a dataframe into a CSV file and then importing it into the database. Please let me know if you have any clue how to use something like to_sql with SSH.

Paroofkey
  • 199
  • 1
  • 11
  • why does your connection require `ssh` to work, should be able to access your db by using the ip of your server – gold_cy Jan 24 '19 at 12:32
  • 1
    We use SSH for security reasons. I talked to our main architect yesterday and informed me that we have to connect via SSH in some way. _However_ I'm looking into [local port forwarding](https://serversforhackers.com/c/ssh-tricks) or a work around like this, but I think our admin has this blocked too for now. At the moment I have about developed a solution to convert it to csv. Not what I was looking for, but it will get the job done :/ – Paroofkey Jan 25 '19 at 09:57

1 Answers1

4

I ended up using local port forwarding to solve this problem.

This is what I used in the terminal for local port forwarding:

ssh -N -v SSH_user@SSH_host -L3306:127.0.0.1:3306

I used sqlalchemy for the connection:

from sqlalchemy import create_engine

engine = create_engine("mysql://user:password@127.0.0.1:3306/db?charset=utf8"
df.to_sql(con=engine, name='test_table', if_exists='replace')
Paroofkey
  • 199
  • 1
  • 11