Create a dataframe with the right schema in the first place:
sql_df = df[['colA', 'colB']]
sql_df.to_sql(sTable, engine, if_exists='append')
Pandas should be fairly memory-efficient here: the columns won't actually be duplicated, they'll just be referenced by sql_df. You could even rename the columns to make them match the SQL table.
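As a sketch of that renaming step (the target names `column_a` / `column_b` are hypothetical, standing in for whatever your SQL table actually uses):

```python
import pandas as pd

df = pd.DataFrame({'colA': [1, 2], 'colB': ['x', 'y'], 'extra': [0, 0]})

# Select only the columns the SQL table expects, then rename them
# to match the table's column names.
sql_df = df[['colA', 'colB']].rename(columns={'colA': 'column_a',
                                              'colB': 'column_b'})
print(list(sql_df.columns))
```

`rename` returns a new DataFrame, so `df` itself is left untouched.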
A super generic way to accomplish this might look like:
def save_to_sql(df, col_map, table, engine):
    sql_df = pd.DataFrame()
    for old_name, new_name in col_map:
        sql_df[new_name] = df[old_name]
    sql_df.to_sql(table, engine, if_exists='append')
This takes the dataframe and a list of (source column, target column) pairs that map your dataframe's columns onto the SQL table's. E.g., save_to_sql(my_df, [('colA', 'column_a'), ('colB', 'column_b')], 'table_name', sql_engine)
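Here's a self-contained run of that helper, using an in-memory SQLite connection in place of a SQLAlchemy engine (pandas' `to_sql`/`read_sql` accept a plain `sqlite3` connection for SQLite); the table and column names are just placeholders:

```python
import sqlite3
import pandas as pd

def save_to_sql(df, col_map, table, con):
    sql_df = pd.DataFrame()
    for old_name, new_name in col_map:
        sql_df[new_name] = df[old_name]
    # index=False keeps the DataFrame index from being written
    # as an extra column that the SQL table doesn't have.
    sql_df.to_sql(table, con, if_exists='append', index=False)

con = sqlite3.connect(':memory:')
my_df = pd.DataFrame({'colA': [1, 2], 'colB': ['x', 'y']})
save_to_sql(my_df, [('colA', 'column_a'), ('colB', 'column_b')],
            'table_name', con)
print(pd.read_sql('SELECT * FROM table_name', con))
```

Note the added `index=False`; without it, `to_sql` also writes the DataFrame's index, which usually won't line up with an existing table schema.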