I have a comma-separated CSV file exported from Scopus. Each row of the file has the following structure:

"A, B, C,D","1111;2222;3333;4444;","A,B,C",1111,"ABCDE","XYZ",,,"338","347",,,"11.10000/111-2-642-35236-2_34",Conference Paper,,Scopus,2-s2.0-1243213123

Although the file is comma-separated, some fields (like the first one) contain internal commas. These raise an error when I use pandas.DataFrame.from_csv, apparently because pandas cannot distinguish separator commas from non-separator commas. Is there any way to load such a CSV file into a DataFrame?

amiref
    "pandas cannot distinguish separator commas and non-separator commas" - that shouldn't happen, unless you mess up on `quotechar`. I can read your line without any errors using `pd.read_csv('a.csv', header=None)`. Please provide a [mcve] that exhibits your problem. – Amadan Oct 29 '18 at 11:00

1 Answer

If the separator is a comma, then:

import pandas as pd
df = pd.read_csv("file.csv", delimiter=",", header=None)

Empty fields are read as NaN:

         0                     1      2     3         ...                        13  14      15                 16
0  A, B, C,D  1111;2222;3333;4444;  A,B,C  1111        ...          Conference Paper NaN  Scopus  2-s2.0-1243213123

pandas treats commas outside double quotes as delimiters and leaves commas inside a quoted field, such as "A, B, C,D", as part of the value.
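
To check this without touching the real file, here is a minimal sketch that parses the sample row from the question out of an in-memory buffer. The io.StringIO buffer and the variable names are my own assumptions for illustration; sep="," and quotechar='"' are already the read_csv defaults and are only spelled out to make the quoting rule visible. Note also that DataFrame.from_csv was deprecated and later removed from pandas, so read_csv is the call to use on current versions.

```python
import io

import pandas as pd

# The sample row from the question, held in memory instead of a file
# (assumption: the real file uses the same quoting on every row).
row = (
    '"A, B, C,D","1111;2222;3333;4444;","A,B,C",1111,"ABCDE","XYZ",,,'
    '"338","347",,,"11.10000/111-2-642-35236-2_34",Conference Paper,,'
    "Scopus,2-s2.0-1243213123\n"
)

# header=None because the sample has no header row; sep and quotechar are
# the defaults, written out here only for clarity.
df = pd.read_csv(io.StringIO(row), header=None, sep=",", quotechar='"')

print(df.shape)       # number of rows and columns parsed
print(df.iloc[0, 0])  # first field, with its internal commas kept
```

If the real file still fails to load, the quoting in that file probably differs from the sample, for example a non-default quotechar or an unbalanced quote on some row.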

SimbaPK