
I'm trying to insert records from my DataFrame into an existing table stored on a Hive cluster. I cannot use a SQLAlchemy engine since I am using a JDBC connection.

My code is:

import pandas as pd
import sqlalchemy

cursor = conn.cursor()  # conn is my existing JDBC connection
engine = sqlalchemy.create_engine('driver://', creator=conn)
dff = pd.DataFrame([[1, 2], ['process 1', 'process 2']], columns=['id', 'description'])
sql_query = "INSERT INTO default.my_table SELECT * FROM {0}".format(dff)
# I also tried this, but it gives me the error: 'Connection' object is not callable
# dff.to_sql(name='my_table', schema='default', con=engine.connect(), if_exists='append')
cursor.execute(sql_query)

I prefer to use INSERT INTO ... SELECT over INSERT INTO ... VALUES because the schema of my data can change in the future, and with this approach I would only need to change my database tables.

With my current code I get this error:

CAUSED BY: Exception: Syntax error

How can I solve my problem?

1 Answer


dff is a DataFrame object, not a string, so it cannot be formatted into the SQL statement like that. You have to do something like this: cursor.executemany(your_sql_statement, rows), where rows is a sequence of parameter tuples, one per record.
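A minimal sketch of that approach, assuming conn is the existing JDBC connection from the question, dff is the DataFrame above, default.my_table is the target table, and the driver accepts ?-style placeholders:

# Build the INSERT statement from the DataFrame columns, so a schema
# change only requires updating the table, not this code
columns = ', '.join(dff.columns)
placeholders = ', '.join(['?'] * len(dff.columns))
sql = "INSERT INTO default.my_table ({0}) VALUES ({1})".format(columns, placeholders)

# One parameter tuple per DataFrame row
rows = [tuple(r) for r in dff.itertuples(index=False, name=None)]

cursor = conn.cursor()
cursor.executemany(sql, rows)

Because the column list and placeholders are derived from dff.columns, a schema change only requires changing the Hive table, which matches the flexibility the question asks for.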

MertG
  • I am trying this: cursor.executemany("INSERT INTO myTable ([id],[description]) VALUES (?,?)", dff.values.tolist()) but I get java.sql.SQLException: Method not supported – May 02 '20 at 17:52
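If the JDBC driver really does not implement executemany (which the Method not supported exception suggests), a minimal fallback sketch, assuming the same cursor, dff, and parameterized statement as above, is to call execute once per row:

sql = "INSERT INTO default.my_table ({0}) VALUES ({1})".format(
    ', '.join(dff.columns), ', '.join(['?'] * len(dff.columns)))

# One execute() call per row; slower than executemany(), but it only relies on
# single-statement parameter binding being supported by the driver
for row in dff.itertuples(index=False, name=None):
    cursor.execute(sql, list(row))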