I want to insert a pandas dataframe into a sqltable that I have pre-created using SQL alchemy.
However, on doing so I get duplicates, as SQLite has an index column and when I copy from the dataframe, it is taking a different index and even if the data is the same, it may append it.
Details about my status:
My pandas dataframe does not have an index on insert. (index=False) Also even though index is primary key, it is the remaining data that is important. No data row shall be duplicate, e.g. have the same combination of contents for each of the columns.
My SQL alchemy create table: (From database.py)
class SQLTable(Base):
__tablename__ = 'sqltable'
__table_args__ = {'sqlite_autoincrement': True}
index = Column(Integer, primary_key= True, nullable=False)
A = Column(Integer, nullable=True)
B = Column(String, nullable=True)
C = Column(BigInteger, nullable=True)
D = Column(Integer, nullable=True)
E = Column(String, nullable=True)
F = Column(Float, nullable=True)
However, I only wish to insert into the sqltable if the line other than the index is not already existant. I found python-pandas-tosql-only-insert-new-rows however I dont know how to apply this to my data.
df.to_sql(
name="sqltable",
con=engine,
if_exists='replace',
index=False,
dtype={
'index': sqlalchemy.types.INTEGER(),
'A': sqlalchemy.types.INTEGER(),
'B': sqlalchemy.types.VARCHAR(length=255),
'C': sqlalchemy.types.BIGINT(),
'D': sqlalchemy.types.VARCHAR(length=255),
'E': sqlalchemy.types.VARCHAR(length=255),
'F': sqlalchemy.types.Float(precision=3, asdecimal=True)
}
)
How to go about the index? What is the best way to do this?
Online tutorial for inserting new rows into SQL table