I have a CSV file whose header and first two entries look like this:

"objectId","url"
"1","someUrl1"
"2","[\"SomeUrl2\",\"SomeUrl3\"]"

I want to read the CSV in Python so that I can extract the id and the url of each row. The url has to end up in a single variable, irrespective of whether it is a string or an array of strings. Each row will have exactly one id. There can be one or more urls per row, as shown above.

  • For 1: Id I need: 1. url I need: "someUrl1"
  • For 2: Id I need: 2. url I need: "["SomeUrl2","SomeUrl3"]"

I tried reading the csv as usual.

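import csv

def loadList(fileName):
    li = list()

    # newline="" lets the csv module handle line endings itself
    with open(fileName, "r", newline="") as inpFile:
        csvreader = csv.reader(inpFile)
        for row in csvreader:
            print(row, "\n")
            li.append(row)

    # the with block closes the file, so no explicit close() is needed
    return li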

But this splits on every comma, including the ones inside the url field, and that's not what I need.

1 Answer

In the csv module, the default escape character is None, meaning that no escaping using the backslash character is processed, neither on input nor on output. You must set it explicitly:

import csv
import io

# escape chars are doubled here because one is consumed by the Python string literal
t = '''"objectId","url"
"1","someUrl1"
"2","[\\"SomeUrl2\\",\\"SomeUrl3\\"]"
'''
with io.StringIO(t, newline='') as fd:
    rd = csv.reader(fd, delimiter=',', escapechar='\\')
    for row in rd:
        print(row)

gives as expected:

['objectId', 'url']
['1', 'someUrl1']
['2', '["SomeUrl2","SomeUrl3"]']

But beware: for the second row, the url is not a list, but the string representation of a list...
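
If an actual list is needed, the string can be parsed after reading. The following is only a minimal sketch, not part of the original answer: the helper names `normalize_urls` and `load_rows` are hypothetical, and it assumes that any url field starting with `[` is a valid JSON array of strings. It reads the file as it is on disk, without modifying it.

```python
import csv
import json


def normalize_urls(value):
    """Return the url field as a list of strings (hypothetical helper)."""
    # Assumption: a field that starts with "[" holds a JSON array of strings.
    if value.startswith("["):
        return json.loads(value)
    # A plain url is wrapped so that callers always get a list.
    return [value]


def load_rows(file_name):
    """Read (objectId, urls) pairs from the CSV file, honoring backslash escapes."""
    rows = []
    with open(file_name, "r", newline="") as inp_file:
        reader = csv.reader(inp_file, escapechar="\\")
        next(reader, None)  # skip the header row
        for object_id, url in reader:
            rows.append((object_id, normalize_urls(url)))
    return rows
```

Returning a list in both cases keeps the calling code uniform. If the single string form is what you want, skip `normalize_urls` and keep the raw field.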

Serge Ballesta
  • Thanks for your answer. But I cannot modify the csv file. My file looks exactly as shown in the question, and I have to read it in that format. – Tushar Ahuja Mar 06 '23 at 08:06
  • "no escaping is processed neither on input nor on output." This is not quite correct (though this fact is irrelevant to the OP's problem). The "standard" CSV escapes double quotes by doubling (`"Montgomery ""Scotty"" Scott"` is read as `Montgomery "Scotty" Scott`) and `csv.reader` by default does this correctly. – Amadan Mar 06 '23 at 08:23
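
The difference between the two escaping styles discussed in these comments can be checked with a short experiment. This is a sketch for illustration only; the variable names are made up, and the second sample line is the one from the question, written as a raw string so the backslashes reach the parser unchanged.

```python
import csv
import io

# Standard CSV quoting: a quote inside a quoted field is written as two quotes.
doubled = '"Montgomery ""Scotty"" Scott"\n'
doubled_row = next(csv.reader(io.StringIO(doubled, newline="")))

# The backslash-escaped line from the question.
escaped = r'"2","[\"SomeUrl2\",\"SomeUrl3\"]"' + "\n"

# Default dialect: backslashes are ordinary characters, so the field is split.
default_row = next(csv.reader(io.StringIO(escaped, newline="")))

# With escapechar set, \" is read as a literal quote inside the field.
fixed_row = next(csv.reader(io.StringIO(escaped, newline=""), escapechar="\\"))

print(doubled_row)       # ['Montgomery "Scotty" Scott']
print(len(default_row))  # more than the expected 2 fields
print(fixed_row)         # ['2', '["SomeUrl2","SomeUrl3"]']
```

So the default reader does handle doubled quotes, but a file that uses backslash escapes needs `escapechar` to be set when reading; the file itself does not have to change.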