I have an script in python that basically have this workflow
- Receive a csv with header.
- Create table on database on PostgreSQL where the fields are the headers of csv.
- Copy the data of csv to table created in step2
here a fragment of code of step 3
file_object = open(file_csv)
cur = connection.cursor()
copy_sql = """
COPY %sFROM stdin WITH CSV HEADER
DELIMITER as '""" + delimiter +"'"
cur.copy_expert(sql=copy_sql % table,file = file_object)
connection.commit()
cur.close()
This script works ok, but some csv inputs have the last columns without header and the code above fails
File "copy_to_psql.py", line 18, in load_csv_psql
cur.copy_expert(sql=copy_sql % table,file = file_object)
psycopg2.DataError: extra data after last expected column
Is there any way to only select the the columns with headers from csv?
Is there any solution using only PostgreSQL?
Any other suggestion?
Thanks in advance