0

there is a case i have whereby a df read from csv and then written to table includes space for certain cols.

here is a portion of the csv referenced as 'filename' below:

id, l, s
1,a,ay
2,b,bee
3,c,see

imports and sqlite connection:

import pandas as pd
import sqlite3
conn = sqlite3.connect('path_to_db')
c = conn.cursor()

then when running this (left out the header option as it is inferred to be 0 i believe):

>>> df = pd.read_csv('filename')
>>> df
  id   l    s
0  1   a   ay
1  2   b  bee
2  3   c  see

this is the part where i hope there is to be a fix once inserted...

>>> df.to_sql('table1', conn, if_exists='append', index=False)

so after checking for the table created, it included a space for col 2-3, based on running this:

>>> for items in c.execute('select * from sqlite_master where type="table" and name="table1"'):
...     print(items)
... 
('table', 'table1', 'table1', 101, 'CREATE TABLE "table1" (\n"id" INTEGER,\n  " l" TEXT,\n  " s" TEXT\n)')

my question is there a way to tell the to_sql function to ignore that?

U13-Forward
  • 69,221
  • 14
  • 89
  • 114
etch_45
  • 792
  • 1
  • 6
  • 21
  • Note that you have space after comma on the first line in the csv file. the easiest would be to just pass correct header, without spaces when read the csv – buran Sep 21 '21 at 10:31
  • Adding to @buran's comment, this could be done using `df = pd.read_csv('filename', names=['id', 'l', 's'])` or after reading the data with `df.rename(columns={' l': 'l', ' s': 's'}, inplace=True)` – LemonPy Sep 21 '21 at 10:36
  • @buran, i can see that on this simple case but not for instances of dozens or hundreds. – etch_45 Sep 21 '21 at 10:42
  • what dozens and hundreds? – buran Sep 21 '21 at 10:43
  • @IftahP in this read with the names, wouldn't you have to know that in advance so that is like mapping it correct? but what about the times when some have spaces while others do not or like my first comment when there are too many cols to sift thru – etch_45 Sep 21 '21 at 10:44
  • `df.rename(columns={col:col.strip() for col in df.columns}, inplace=True)` – buran Sep 21 '21 at 10:48
  • df.columns = [x.strip() for x in df.columns] – Arkadiusz Sep 21 '21 at 10:49
  • @etch_45 as you illustrated your question I illustrated the answer. You can see other comments for some ways to fine-tune it to general cases. – LemonPy Sep 21 '21 at 10:54

0 Answers0