1

I'm using Python 2.6 (32bits) on Windows 8. The purpose is building a pysqlite database from a comma separated CSV file where the first row is column name.

My code as following (some code skipped):

def BuildDatabaseFromCSV(self, file_name):
    with file(file_name, 'rb') as source:   
        csvreader = csv.reader(source)
        first_entry = csvreader.next()
        self.BuildTable(first_entry)
        #self.AddEntries(csvreader)
        for entry in csvreader:
            self.AddEntry(entry)

def BuildTable(self, cols_names):
    cmd_line = str(cols_names)[1:-1]   #Remove '[]' from list type    
    cmd_line = "create table %s (%s)" % (self._t_name, cmd_line)
    self._cursor.execute(cmd_line)  #Creating table
    self._db.commit()
    self._col_num = len(cols_names)

def AddEntry(self, entry):
    length = len(entry)
    if self._col_num > length: #Padding empty fields
        entry += ([''] * (self._col_num - length))
    elif self._col_num < length: #Crop extra fields
        entry = entry[:self._col_num]
    cmd_line = "insert into %s values (%s)"\
        % (self._t_name, ("?," * self._col_num)[:-1])
    self._cursor.execute(cmd_line, entry)
    self._db.commit()

def AddEntries(self, entries):
    cmd_line = "insert into %s values (%s)"\
        % (self._t_name, ("?," * self._col_num)[:-1])
    self._cursor.executemany(cmd_line, entries)
    self._db.commit()

Originally I used AddEntries() to add entries from CSV file to my database. However, for CSV file generated by Excel 2003, the amount of "field" for each line might be different. Some lines may contains more or less field than the "column name" line (extra or missing commas) if they contain tailing empty columns or some extra garbage.

I cannot use "field_size_limit" since I need to read the CSV first to define it.

Is there a simpler way to achieve any of following purpose?
1. When reading CSV file, specify a fixed amount of columns so it inserts or deletes commas automatically?
2. When inserting a entry to mysqlite database, specify a fixed amount of columns so it can accept variable amount of input columns?

Thanks in advance.

0 Answers0