0

I want to write data from an csv file. Everything works fine. If I run my script again I receive an error message:

IntegrityError: (mysql.connector.errors.IntegrityError) 1062 (23000): Duplicate entry 'AAPL' for key 'PRIMARY'

It is due to the fact that the entries are duplicate. How can I check now whether the data which I want to write into the database already exists and process it it exists? The csv file header looks like this

ticker isin product_name currency market_data_source trading_location country sector

and the data inside like this:

AAPL IE00B4BNMY34 Accenture plc USD Yahoo NYSE USA Computer Hardware

And the code like this:

import sqlalchemy as sqlal
import pandas as pd

#csv loader
csv_loader = pd.read_csv('C:/Test.csv', encoding='cp1252', sep=';', index_col=0).dropna()

#connection to mysql database
mysql_engine = sqlal.create_engine('mysql+mysqlconnector://xxx/financialanalysis')
mysql_engine.raw_connection()

metadata = sqlal.MetaData()

#------------------------------
#create table in mysql database
#------------------------------
product  = sqlal.Table('product', metadata,
                       sqlal.Column('ticker', sqlal.String(10), primary_key=True, nullable=False, unique=True),                   
                       sqlal.Column('isin', sqlal.String(12), nullable=True),
                       sqlal.Column('product_name', sqlal.String(80), nullable=True),
                       sqlal.Column('currency', sqlal.String(3), nullable=True),
                       sqlal.Column('market_data_source', sqlal.String(20), nullable=True),
                       sqlal.Column('trading_location', sqlal.String(20), nullable=True),
                       sqlal.Column('country', sqlal.String(20), nullable=True),
                       sqlal.Column('sector', sqlal.String(80), nullable=True),
                       )

metadata.create_all(mysql_engine) 

#Write the data into the mysql database
csv_loader = csv_loader.reset_index()
insert_product_data = product.insert().values(csv_loader.to_dict('records'))
mysql_engine.execute(insert_product_data)
MCM
  • 1,479
  • 2
  • 17
  • 22
  • Have you considered using an SQL Merge statement instead of an Insert? That will avoid having to pull all the existing keys from the database into your Python program and check them. It will also allow you to update rows in case new data is present. – CJC Dec 26 '16 at 19:14
  • @CJC, do you have an example? – MCM Dec 26 '16 at 20:22
  • @CJC no MERGE in mysql – e4c5 Dec 27 '16 at 02:22
  • just use LOAD DATA INFILE. – e4c5 Dec 27 '16 at 02:36
  • @e4c5, do you have an example how it should work when the same data already exists and if the values in the csv file got updates? – MCM Dec 27 '16 at 10:03
  • Just google for LOAD DATA INFILE first result – e4c5 Dec 27 '16 at 10:04
  • Apologies, there's no natural merge statement, but try this http://stackoverflow.com/questions/725556/how-can-i-merge-two-mysql-tables – CJC Dec 28 '16 at 14:11
  • @e4c5, I solved it by using `LOAD DATA LOCAL INFILE`. The main question is how to update existing values using this statement. – MCM Dec 28 '16 at 21:58

1 Answers1

0

solved it by using LOAD DATA LOCAL INFILE

MCM
  • 1,479
  • 2
  • 17
  • 22