Create a dataframe with the right schema in the first place:
sql_df = df[['colA', 'colB']]
sql_df.to_sql(sTable, engine, if_exists='append')
Pandas should be fairly memory-efficient here: the columns won't actually be duplicated, they'll just be referenced by sql_df. You could even rename the columns to make them match the SQL table.
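As a sketch of that renaming step (the target names `column_a` / `column_b` are hypothetical, standing in for whatever your SQL table actually uses):

```python
import pandas as pd

df = pd.DataFrame({'colA': [1, 2], 'colB': ['x', 'y'], 'extra': [0, 0]})

# Select only the columns the SQL table expects, then rename them
# to match the table's column names.
sql_df = df[['colA', 'colB']].rename(columns={'colA': 'column_a',
                                              'colB': 'column_b'})
print(list(sql_df.columns))
```

`rename` returns a new DataFrame, so `df` itself is left untouched.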
A super generic way to accomplish this might look like:
def save_to_sql(df, col_map, table, engine):
    sql_df = pd.DataFrame()
    for old_name, new_name in col_map:
        sql_df[new_name] = df[old_name]
    sql_df.to_sql(table, engine, if_exists='append')
This takes the dataframe and a list of (source column, target column) pairs that map your dataframe's columns onto the SQL table's. E.g., save_to_sql(my_df, [('colA', 'column_a'), ('colB', 'column_b')], 'table_name', sql_engine)
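Here's a self-contained run of that helper, using an in-memory SQLite connection in place of a SQLAlchemy engine (pandas' `to_sql`/`read_sql` accept a plain `sqlite3` connection for SQLite); the table and column names are just placeholders:

```python
import sqlite3
import pandas as pd

def save_to_sql(df, col_map, table, con):
    sql_df = pd.DataFrame()
    for old_name, new_name in col_map:
        sql_df[new_name] = df[old_name]
    # index=False keeps the DataFrame index from being written
    # as an extra column that the SQL table doesn't have.
    sql_df.to_sql(table, con, if_exists='append', index=False)

con = sqlite3.connect(':memory:')
my_df = pd.DataFrame({'colA': [1, 2], 'colB': ['x', 'y']})
save_to_sql(my_df, [('colA', 'column_a'), ('colB', 'column_b')],
            'table_name', con)
print(pd.read_sql('SELECT * FROM table_name', con))
```

Note the added `index=False`; without it, `to_sql` also writes the DataFrame's index, which usually won't line up with an existing table schema.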