
I have a problem with the # symbol.

Some of the data in my columns contains the # symbol, for example:

'JRE#150' 
'July banner #150' 

When I insert data from the file all.csv into SQL Server, records containing this character are not inserted into the table correctly.

What do I mean?!

If I try to insert the value 'JRE#150', only the 'JRE' part is stored, and NULL is inserted into the other columns.

Here is what the process looks like and what I am doing:

  1. The first independent engine fetches the data from the API into a DataFrame.

    The following line is responsible for exporting this data to the .csv file:

     df.to_csv(r'C:\\...\all.csv',  encoding='utf-8', index=False)
    
  2. The second independent mechanism does this:

     df = pd.read_csv(r'C:\\...\all.csv', sep=',', comment='#', encoding='utf-8', low_memory=False)
    
     df.to_sql(table_name, engine, if_exists = 'replace', chunksize = None, index=False)
    

How can I insert data containing # into SQL Server, without replacing it with another symbol or deleting it?

What is the problem here and how can I fix it?

I will be grateful for the help.

Comment: I've never heard of this before. I just Googled it now and found this: https://stackoverflow.com/questions/32235696/pandas-to-sql-gives-unicode-decode-error – ASH Sep 15 '21 at 14:38

1 Answer


Remove the comment='#' parameter from pd.read_csv(...).

As per the Pandas read_csv documentation:

comment: str, optional

Indicates remainder of line should not be parsed. If found at the beginning of a line, the line will be ignored altogether. This parameter must be a single character. Like empty lines (as long as skip_blank_lines=True), fully commented lines are ignored by the parameter header but not by skiprows. For example, if comment='#', parsing #empty\na,b,c\n1,2,3 with header=0 will result in a,b,c being treated as the header.
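
Here is a minimal sketch of what happens, using an in-memory CSV and two illustrative column names ('code' and 'name' are placeholders, not the real columns in all.csv):

    import io

    import pandas as pd

    csv_data = "code,name\nJRE#150,July banner #150\n"

    # With comment='#', pandas drops everything from the first '#' to the end
    # of the line, so 'JRE#150' becomes 'JRE' and the remaining column becomes
    # NaN, which to_sql then writes to SQL Server as NULL.
    truncated = pd.read_csv(io.StringIO(csv_data), sep=',', comment='#')
    print(truncated)
    #   code  name
    # 0  JRE   NaN

    # Without comment='#', the '#' is treated as ordinary data and the full
    # values survive.
    fixed = pd.read_csv(io.StringIO(csv_data), sep=',')
    print(fixed)
    #       code              name
    # 0  JRE#150  July banner #150

The to_sql call itself does not need to change; once comment='#' is removed from read_csv, the complete values reach SQL Server.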
