17

I'm trying to store a mySQL query result in a pandas DataFrame using pymysql and am running into errors building the dataframe. Found a similar question here and here, but it looks like there are pymysql-specific errors being thrown:

import pandas as pd
import datetime
import pymysql

# dummy values 
connection = pymysql.connect(user='username', password='password', databse='database_name', host='host')

start_date = datetime.datetime(2017,11,15)
end_date = datetime.datetime(2017,11,16)

try:
    with connection.cursor() as cursor:
    query = "SELECT * FROM orders WHERE date_time BETWEEN %s AND %s"

    cursor.execute(query, (start_date, end_date)) 

    df = pd.DataFrame(data=cursor.fetchall(), index = None, columns = cursor.keys())
finally:
    connection.close()

returns: AttributeError: 'Cursor' object has no attribute 'keys'

If I drop the index and columns arguments:

try:
    with connection.cursor() as cursor:
    query = "SELECT * FROM orders WHERE date_time BETWEEN %s AND %s"

    cursor.execute(query, (start_date, end_date)) 

    df = pd.DataFrame(cursor.fetchall())
finally:
    connection.close()

returns ValueError: DataFrame constructor not properly called!

Thanks in advance!

msoderstrom
  • 537
  • 1
  • 7
  • 14

3 Answers3

38

Use Pandas.read_sql() for this:

query = "SELECT * FROM orders WHERE date_time BETWEEN ? AND ?"
df = pd.read_sql(query, connection,  params=(start_date, end_date))
MaxU - stand with Ukraine
  • 205,989
  • 36
  • 386
  • 419
  • 1
    pandas.read_sql() works well in general. However, what if the command is to execute a stored procedure which needs to update a table (and commit the updated rows of the table)? In that case, how to ensure the commit is done? Does pd.read_sql allow commits? – Nodame Jun 18 '19 at 21:45
  • 2
    @Nodame, you can first call your stored procedure using sqlalchemy and then read results using pd.read_sql – MaxU - stand with Ukraine Jun 18 '19 at 21:51
  • How does a pymysql conmection work with pandas read_sql since pymysql Connection doesnt inherit from any schlalchemy class? – Ludvig W Jan 31 '20 at 19:15
2

Try This:

import pandas as pd
import pymysql

mysql_connection = pymysql.connect(host='localhost', user='root', password='', db='test', charset='utf8')
                    
sql = "SELECT * FROM `brands`"
df = pd.read_sql(sql, mysql_connection, index_col='brand_id')
print(df)
1

Thank you for your suggestion to use pandas.read_sql(). It works with executing a stored procedure as well! I tested it in MSSQL 2017 environment.

Below is an example (I hope it helps others):

def database_query_to_df(connection, stored_proc, start_date, end_date):
    # Define a query
    query ="SET NOCOUNT ON; EXEC " + stored_proc + " ?, ? " + "; SET NOCOUNT OFF"

    # Pass the parameters to the query, execute it, and store the results in a data frame
    df = pd.read_sql(query, connection, params=(start_date, end_date))
    return df
Piotr Labunski
  • 1,638
  • 4
  • 19
  • 26