1

I am currently writing a program that will take data from an excel spreadsheet and insert it into a sql server table that I have created within the program.

I have previously assigned the datetime column to be a nvarchar(250) for the purpose of getting the overall programme to work, however when I go to change it to datetime, the data is inputted into the wrong columns? The rest of the code worked with the nvarchar datatype aswell.

import pyodbc

connection_string = r'connection_string'
data = 'file_path'

conn = pyodbc.connect(connection_string)
cur = conn.cursor()

createtable = """
create table table1(
    ID Int NULL,
    Date datetime(250) NULL,
    City nvarchar(250) NULL,
    Country nvarchar(250) NULL,
    Image nvarchar(250) NULL,
    Length nvarchar(250) NULL,
    Date_Of_capture nvarchar(250) NULL,
    Comments nvarchar(1000) NULL
    )"""

truncatetable = """truncate table table1"""

with open(data) as file:
    file.readline()
    lines = file.readlines()

if cur.tables(table="table1").fetchone():
    cur.execute(truncatetable)
    for line in lines:
        cols = line.split(',')
        cols = line.replace("'", "")
        sql = "INSERT INTO table1 VALUES({}, '{}', '{}', '{}', '{}', '{}','{}','{}')".format(cols[0], cols[1],cols[2], cols[3], cols[4], cols[5], cols[6], cols[7])
        cur.execute(sql)
else:
    cur.execute(createtable)
    for line in lines:
        cols = line.split(',')
        sql = "INSERT INTO table1 VALUES({}, '{}', '{}', '{}', '{}', '{}','{}','{}')".format(cols[0], cols[1],cols[2], cols[3], cols[4], cols[5], cols[6], cols[7])
        cur.execute(sql)

conn.commit()

conn.close()

I would expect the date column to show as a datetime data type whilst being contained within one column however it changes the tables so that all the columns are incorrect and each digit of the date is within a different column?

Any help is greatly appreciated. Thank you.

Parfait
  • 104,375
  • 17
  • 94
  • 125
Mattidge
  • 83
  • 8
  • Please add some data, from both the input and the output. – Aryerez Oct 07 '19 at 11:16
  • 2
    I have to assume that your `create table` statement is not being executed because `datetime(250)` would definitely throw an exception. Also, read up on *parameterized queries*, and [edit] your question to show us what `lines` looks like. – Gord Thompson Oct 07 '19 at 12:32
  • Apologies, I can't figure how to show a sample of the data? Any advice? – Mattidge Oct 07 '19 at 12:40
  • You cannot `open` on an Excel file, a binary format types (.xls, .xlsx, .xlsm, .xlsb). Likely, your data is txt or csv, neither of which are Excel files. – Parfait Oct 07 '19 at 14:07
  • @Parfait - except that Windows will associate .csv with Excel if Office is installed, so in that case one could be forgiven for thinking that .csv is an "Excel file". ("Well, I double-click the file and it opens in Excel....") – Gord Thompson Oct 07 '19 at 14:30
  • @MatthewMeek - `print(lines[:2])` should give you the first two elements in the list. The resulting line might be rather long, but as long as you faithfully copy-and-paste the result into your question then someone can always help you with the formatting after the fact. – Gord Thompson Oct 07 '19 at 14:38
  • Correct @GordThompson and that is a misunderstanding. Excel is simply an interface. Notepad can open CSVs but they are not Notepad files! – Parfait Oct 07 '19 at 14:43

1 Answers1

2

Consider the following best practices:

  • Always specify columns in INSERT INTO even SELECT clauses, specifically use INSERT INTO myTable (Col1, Col2, Col3, ...) which helps in readability and maintainability;

  • Use parameterization with prepared statement to avoid quote escaping or type casting among other important items. Additionally Python allows tuples to be passed into params argument of cursor.execute() without listing each individual column.

  • Use the csv library of Python to traverse CSV files with lists or dictionary for proper alignment and avoid the memory intensive .readlines() call;

  • Combine CREATE TABLE and TRUNCATE in one SQL call to avoid if conditionals with cursor fetch call.

See adjusted code.

import csv
...

action_query = """
    IF EXISTS (SELECT * FROM INFORMATION_SCHEMA.TABLES WHERE TABLE_NAME = N'mytable')
      BEGIN
        TRUNCATE TABLE table1
      END
    ELSE
      BEGIN
        CREATE TABLE table1(
           ID Int NULL,
           Date datetime NULL,
           City nvarchar(250) NULL,
           Country nvarchar(250) NULL,
           Image nvarchar(250) NULL,
           Length nvarchar(250) NULL,
           Date_Of_capture nvarchar(250) NULL,
           Comments nvarchar(1000) NULL
        )
      END
""")

cur.execute(action_query)
conn.commit()

# PREPARED STATEMENT
append_query = """INSERT INTO mytable (ID, Date, City, Country, Image, 
                                       Length, Date_Of_capture, Comments)
                  VALUES (?, ?, ?, ?, ?, ?, ?, ?)
               """

# ITERATE THROUGH CSV AND INSERT ROWS
with open(mydatafile) as f:
    next(f) # SKIP HEADERS
    reader = csv.reader(f)

    for r in reader:
        # RUN APPEND AND BIND PARAMS
        cur.execute(append_query, params=r)
        conn.commit()

cur.close()
conn.close()
Parfait
  • 104,375
  • 17
  • 94
  • 125
  • Perfect thank you! Unfortunately I am required to check if the table exists in python as I am on a learning course however I was able to take bits of this to complete it! Thank you for your response! – Mattidge Oct 08 '19 at 08:08
  • Great to hear and glad to help! Hopefully your learning course emphasizes parameterization. – Parfait Oct 08 '19 at 16:05