0

I have a file with many lines like so:

1,50,"asasd",25
2,51,"apa,asdfi",26
.....
.....

How do I split the second line on commas while ignoring the comma inside the double-quoted string?

I need the result for the second line to be:

[2,51,"apa,asdfi",26]

Currently I am trying:

x = line.split(',')

The result is:

['2','51','"apa','asdfi"','26']
Iron Fist
Srinivasan A

3 Answers

4

As mentioned in the comments, try reading your file with the csv module; it handles a comma inside a quoted string correctly. If you have trouble using it, read the documentation at https://docs.python.org/2/library/csv.html or see some examples at https://dzone.com/articles/python-101-reading-and-writing

import csv

with open('file.csv', 'r') as f:
    spamreader = csv.reader(f, delimiter=",")
    for row in spamreader:
        # each row is a list of the fields on one line
        print(row)

result (csv.reader strips the surrounding double quotes from quoted fields):

['1', '50', 'asasd', '25']
['2', '51', 'apa,asdfi', '26']
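If you only have the line as a string rather than a file, the same reader can be applied to it directly. The following is a minimal sketch in Python 3, not part of the original answer: `split_csv_line` and `to_int_if_possible` are hypothetical helper names, and the int conversion is an assumption based on the unquoted numbers in the desired result from the question.

```python
import csv


def to_int_if_possible(field):
    """Return int(field) when the field parses as an integer, else the string."""
    try:
        return int(field)
    except ValueError:
        return field


def split_csv_line(line):
    """Split one CSV-formatted line, keeping commas inside double quotes."""
    # csv.reader accepts any iterable of strings, so wrap the line in a list.
    fields = next(csv.reader([line]))
    return [to_int_if_possible(f) for f in fields]


print(split_csv_line('2,51,"apa,asdfi",26'))
# prints [2, 51, 'apa,asdfi', 26]
```

Note that the quoted field comes back without its double quotes, since the csv module treats them as delimiters rather than data.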
Whitefret
2

You can also extract them with re.findall:

>>> import re
>>> s = '2,51,"apa,asdfi",26'
>>> re.findall(r'(\d+|".*")', s)
['2', '51', '"apa,asdfi"', '26']

That said, I recommend the method given in the linked duplicate question.
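One caveat with the pattern above: `".*"` is greedy, so on a line with two quoted fields, such as `1,"a,b",2,"c,d"`, it matches from the first quote to the last and returns `['1', '"a,b",2,"c,d"']`. It also only picks up unquoted fields that are digits. A possible variant, sketched here under the assumption that fields never contain escaped quotes and that empty fields can be dropped (`FIELD` and `split_fields` are hypothetical names):

```python
import re

# Try a complete double-quoted field first; otherwise take a run of
# characters that contains no comma.
FIELD = re.compile(r'"[^"]*"|[^,]+')


def split_fields(line):
    """Return the fields of a line, leaving quoted fields (with quotes) intact."""
    return FIELD.findall(line)


print(split_fields('2,51,"apa,asdfi",26'))
# prints ['2', '51', '"apa,asdfi"', '26']
print(split_fields('1,"a,b",2,"c,d"'))
# prints ['1', '"a,b"', '2', '"c,d"']
```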

Community
Iron Fist
1

You can try the following code:

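line = '2,51,"apa,asdfi",26'
result = line.split(",")

# Rejoin the two pieces of the quoted field that split(",") broke apart.
# Handles a single quoted field containing exactly one comma.
for i in range(len(result) - 1):
    # An odd number of quotes means this piece opens a quoted field
    # whose closing quote ended up in the next piece.
    if result[i].count('"') % 2 == 1:
        result[i] += "," + result[i + 1]
        del result[i + 1]  # delete by index; remove() drops the first equal value
        break

# Convert the purely numeric fields to int.
for i in range(len(result)):
    if result[i].isdigit():
        result[i] = int(result[i])

print(result)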

Output:

[2, 51, '"apa,asdfi"', 26]
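The loop above rejoins one pair of pieces and then stops, so a quoted field with two or more commas, or a second quoted field on the same line, would still come out split. A rough sketch of a more general manual approach is to scan the line once and only split on commas seen outside quotes. This assumes quotes are always balanced and never escaped; `split_outside_quotes` is a hypothetical name, and the csv module remains the more robust choice.

```python
def split_outside_quotes(line, sep=",", quote='"'):
    """Split line on sep, ignoring any sep that appears between quote characters."""
    fields = []
    current = []
    in_quotes = False
    for ch in line:
        if ch == quote:
            # Toggle the quoted state and keep the quote in the field.
            in_quotes = not in_quotes
            current.append(ch)
        elif ch == sep and not in_quotes:
            # A separator outside quotes ends the current field.
            fields.append("".join(current))
            current = []
        else:
            current.append(ch)
    fields.append("".join(current))
    return fields


print(split_outside_quotes('2,51,"apa,asdfi",26'))
# prints ['2', '51', '"apa,asdfi"', '26']
print(split_outside_quotes('3,"a,b,c",4'))
# prints ['3', '"a,b,c"', '4']
```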
Ren