
The csv file has the following structure:

a,b,c
a,b,c,d,e,f,g
a,b,c,d
a,b,c

If I use file = pd.read_csv('Desktop/export.csv', delimiter=','), it throws a tokenizing error like this: pandas.errors.ParserError: Error tokenizing data. C error: Expected 9 fields in line 3, saw 10

I do NOT want to skip the bad lines. I want to read the csv with all of its columns and create a dataframe that looks like this:

unnamed column1, unnamed column2, ....... unnamed column 7
a,b,c
a,b,c,d,e,f,g
a,b,c,d
a,b,c

How can I load the bad lines from the csv file?
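
In other words, I think read_csv would need to know about all 7 columns up front. A minimal sketch of that idea (assuming the widest row always has 7 fields and the file has no header row) might be something like:

import pandas as pd

# Giving read_csv explicit names for 7 columns means shorter rows are
# padded with NaN instead of raising ParserError (assumes no header row
# and that no row has more than 7 fields).
file = pd.read_csv('Desktop/export.csv',
                   header=None,
                   names=[f'column{i}' for i in range(1, 8)])

but I am not sure whether that is the right approach.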

ichino

1 Answer


You can set error_bad_lines to False.

import pandas as pd

# With error_bad_lines=False, rows with too many fields are dropped
# instead of raising a ParserError.
file = pd.read_csv('Desktop/export.csv', delimiter=',', error_bad_lines=False)
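
Note that error_bad_lines was deprecated in pandas 1.3 and removed in pandas 2.0. On newer versions the rough equivalent (a sketch, assuming pandas >= 1.3) is:

import pandas as pd

# on_bad_lines='skip' replaces error_bad_lines=False on pandas >= 1.3:
# rows with too many fields are dropped instead of raising ParserError.
file = pd.read_csv('Desktop/export.csv', delimiter=',', on_bad_lines='skip')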
iohans