It's my understanding that you could pull data from a database (in this case MySQL), load this data into a pandas df to manipulate it (add new columns, complex calculations not easily done in SQL, etc) then load the new columns back into MySQL. However, I am having trouble with the final step of adding the newly created columns back into the same existing table in MySQL where I pulled the original data. The only similar question I found is here, but I am missing a key component somewhere.
I am using pandas to_sql
to perform this task (is there a better option?). My code is pulling crypto data from my MySQL table, loading this into a pandas df and I am adding a new volatility column (I know you could calculate this in SQL, this is just an example).
SQL table (crypto)
date open_price close_price high_price low_price market_cap volume
1 2018-03-11 8852.78 9578.63 9711.89 8607.12 149,716,000,000 6,296,370,000
2 2018-03-10 9350.59 8866.00 9531.32 8828.47 158,119,000,000 5,386,320,000
3 2018-03-09 9414.69 9337.55 9466.35 8513.03 159,185,000,000 8,704,190,000
4 2018-03-08 9951.44 9395.01 10147.40 9335.87 168,241,000,000 7,186,090,000
pandas df
date open_price close_price high_price low_price market_cap volume Volatility
1 2018-03-11 8852.78 9578.63 9711.89 8607.12 149,716,000,000 6,296,370,000 .2
2 2018-03-10 9350.59 8866.00 9531.32 8828.47 158,119,000,000 5,386,320,000 .58
3 2018-03-09 9414.69 9337.55 9466.35 8513.03 159,185,000,000 8,704,190,000 .56
4 2018-03-08 9951.44 9395.01 10147.40 9335.87 168,241,000,000 7,186,090,000 .12
I first went into MySQL and added a new column using:
ALTER TABLE `crypto`
ADD COLUMN Volatility float NOT NULL;
The code below is what I used to first push data to MySQL. I changed if_exists
to append
but I am receiving an error message
df.to_sql(name='crypto', con=engine, if_exists = 'append', chunksize=10000)
DatabaseError: (mysql.connector.errors.DatabaseError) 1364 (HY000): Field 'Volitility_90' doesn't have a default value
How can I append just 1 new column to my MySQL table? Do I need to use another package and use cursor.execute()
. Any help with this code, or with some online tutorials would be helpful! I just can't seem to find a lot of content on this specific example, which has me second guessing this whole data manipulation strategy altogether.