I am optimizing my script and ran into this problem:

I have a CSV file whose first column is just an index and whose second column contains a string (a sentence of arbitrary length). I want to create two variables, "index" and "string", that hold all the indices and all the strings respectively. This is my code:

import csv

with open(file_name, 'r', encoding="utf8") as csvfile:
    train_set_x = csv.reader(csvfile, delimiter=',', quotechar='|')
    index = [[c[0], c[1]] for c in train_set_x]  # first pass: [index, string] pairs
    text = [a[1] for a in index]                 # second pass: strings only

This does the job, but it takes two iterations. Is there a cleaner way to do it? Thank you.

cs95
Yi Shen

1 Answer

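There definitely is. Use zip with iterable unpacking, which transposes the rows into columns in a single pass over the reader. Note that index and text come back as tuples, not lists.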

index, text = zip(*((c[0], c[1]) for c in train_set_x))
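For anyone who wants to try the one-liner without a file on disk, here is a minimal self-contained sketch. The sample data and the use of io.StringIO in place of open(file_name, ...) are assumptions made for illustration; the reader settings are copied from the question.

```python
import csv
import io

# Hypothetical sample data standing in for the asker's CSV file.
sample = "0,first sentence\n1,second sentence\n2,third sentence\n"

with io.StringIO(sample) as csvfile:
    train_set_x = csv.reader(csvfile, delimiter=',', quotechar='|')
    # Single pass over the reader: zip(*pairs) transposes rows into columns.
    index, text = zip(*((c[0], c[1]) for c in train_set_x))

print(index)  # the first column, as a tuple of strings
print(text)   # the second column, as a tuple of strings
```

The values in index stay strings because csv.reader does not convert types; wrap c[0] in int() if numeric indices are needed.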

MCVE:

In [152]: x, y = zip(*[(1, 2), (3, 4), (5, 6)])

In [153]: x
Out[153]: (1, 3, 5)

In [154]: y
Out[154]: (2, 4, 6)
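One caveat with the zip approach: if the reader yields no rows, zip() produces nothing and the two-name assignment raises ValueError. If that matters, or if lists are preferred over tuples, a plain loop is a possible alternative that is still a single pass. This is only a sketch; split_columns is a hypothetical helper name, and skipping short rows is an assumption about how malformed lines should be handled.

```python
import csv
import io


def split_columns(csvfile):
    """Return (index, text) lists from a two-column CSV in one pass."""
    index, text = [], []
    for row in csv.reader(csvfile, delimiter=',', quotechar='|'):
        if len(row) < 2:
            continue  # assumption: silently skip blank or short rows
        index.append(row[0])
        text.append(row[1])
    return index, text


# Usage with in-memory data instead of a real file (hypothetical sample).
index, text = split_columns(io.StringIO("0,hello there\n1,second row\n"))
empty_index, empty_text = split_columns(io.StringIO(""))
```

With an empty input this returns two empty lists instead of raising.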
cs95