I want to read the contents of a file into a multidimensional list in Python.

The file looks like this:

b,30.83,0,u,g,w,v,1.25,t,t,01,f,g,00202,0,+
a,58.67,4.46,u,g,q,h,3.04,t,t,06,f,g,00043,560,+
a,24.50,0.5,u,g,q,h,1.5,t,f,0,f,g,00280,824,+
b,27.83,1.54,u,g,w,v,3.75,t,t,05,t,g,00100,3,+
b,20.17,5.625,u,g,w,v,1.71,t,f,0,f,s,00120,0,+
b,32.08,4,u,g,m,v,2.5,t,f,0,t,g,00360,0,+

What I want is to split each line on commas (',') and to start the next dimension at each newline ('\n'), for example:

x[0][0]='b', x[0][1]=30.83, x[1][0]='a' .... 

Are there any suggestions? I tried to use csv, but I found it too complicated to access the values later. Is there a way to do this with the simple file methods? Thanks in advance.

jonrsharpe
unknown
  • how does `csv` make it "too complicated (...) to access the values later" ??? – bruno desthuilliers Jan 09 '14 at 11:40
  • The len() function doesn't work with that, I guess. Plus, I couldn't access the values the normal way, x[][]. I just couldn't figure out how to deal with CSVs. I am a newbie at all of this, of course. – unknown Jan 09 '14 at 11:57
  • The `csv` module is well documented, _and_ you can test and explore in the interactive Python shell. That's a better way to solve a problem than giving up on proven tools and trying to reinvent the wheel just because you didn't grasp the docs at first read. – bruno desthuilliers Jan 09 '14 at 12:08
  • Okay. I'll try to do that, of course. I was just looking for immediate and easy way. Thanks for the recommendation. – unknown Jan 09 '14 at 12:13

3 Answers

Use a list comprehension (strip() also removes the trailing newline from each line):

>>> x=[i.strip().split(',') for i in open("filename", 'r')]

EDIT: For the input in the question, this would produce:

>>> x
[['b', '30.83', '0', 'u', 'g', 'w', 'v', '1.25', 't', 't', '01', 'f', 'g', '00202', '0', '+'], ['a', '58.67', '4.46', 'u', 'g', 'q', 'h', '3.04', 't', 't', '06', 'f', 'g', '00043', '560', '+'], ['a', '24.50', '0.5', 'u', 'g', 'q', 'h', '1.5', 't', 'f', '0', 'f', 'g', '00280', '824', '+'], ['b', '27.83', '1.54', 'u', 'g', 'w', 'v', '3.75', 't', 't', '05', 't', 'g', '00100', '3', '+'], ['b', '20.17', '5.625', 'u', 'g', 'w', 'v', '1.71', 't', 'f', '0', 'f', 's', '00120', '0', '+'], ['b', '32.08', '4', 'u', 'g', 'm', 'v', '2.5', 't', 'f', '0', 't', 'g', '00360', '0', '+']]
>>> x[0][0]
'b'
>>> x[4][2]
'5.625'
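
Note that every field above is still a string, while the question asks for x[0][1]=30.83 as a number. A minimal sketch of one way to convert the numeric-looking fields follows. The helper names `parse_field` and `load_rows` are made up for illustration, and `io.StringIO` stands in for the real file so the example is self-contained.

```python
import io

def parse_field(value):
    """Return value as a float when it looks numeric, otherwise unchanged."""
    try:
        return float(value)
    except ValueError:
        return value

def load_rows(lines):
    """Split each non-empty line on commas and convert numeric fields."""
    return [[parse_field(v) for v in line.strip().split(',')]
            for line in lines if line.strip()]

# Assumption: io.StringIO replaces open("filename") purely for the demo.
sample = io.StringIO(
    "b,30.83,0,u,g,w,v,1.25,t,t,01,f,g,00202,0,+\n"
    "a,58.67,4.46,u,g,q,h,3.04,t,t,06,f,g,00043,560,+\n"
)
x = load_rows(sample)
print(x[0][0], x[0][1], x[1][0])
```

Be aware that this also turns zero-padded fields such as '01' and '00202' into floats, so the padding is lost. If those columns are really codes rather than numbers, it is probably better to convert only the columns you know are numeric.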
unknown
devnull
  • @game4cesc: did you try the code and look at what `x` is afterwards instead of just assuming that the person answering doesn't know what they're doing and therefore must have given you a 1-dimensional list? – Wooble Jan 09 '14 at 12:06
  • Sorry for my naive approach. I'll try to keep my dumbness to a minimum next time. Indeed, this gives me exactly what I was looking for. Thanks!!! – unknown Jan 09 '14 at 14:21
Edit: some cleanup. Note that with dtype=None, each column of the array is parsed to its own type. If you want strings only, dtype=str does the trick (the np.str alias has been removed from recent NumPy releases). The StringIO is there for a self-contained example, but you can replace it with your filename (see the genfromtxt documentation).

import numpy as np
from io import StringIO  # Python 3; on Python 2 this was `from StringIO import StringIO`

text = """b,30.83,0,u,g,w,v,1.25,t,t,01,f,g,00202,0,+
a,58.67,4.46,u,g,q,h,3.04,t,t,06,f,g,00043,560,+
a,24.50,0.5,u,g,q,h,1.5,t,f,0,f,g,00280,824,+
b,27.83,1.54,u,g,w,v,3.75,t,t,05,t,g,00100,3,+
b,20.17,5.625,u,g,w,v,1.71,t,f,0,f,s,00120,0,+
b,32.08,4,u,g,m,v,2.5,t,f,0,t,g,00360,0,+"""

# dtype=None infers a type per column and returns a structured array.
# The encoding argument needs NumPy 1.14 or later.
data = np.genfromtxt(StringIO(text), dtype=None, delimiter=',', encoding='utf-8')

print(data['f1'])  # second column, parsed as floats

Also, if subsequent code insists on plain Python data structures, that is no problem. For instance:

print(data.tolist())              # list of row tuples
print(list(zip(*data.tolist())))  # list of column tuples
Eelco Hoogendoorn
  • Thanks. But I thought it's possible to store different types of values in a list in Python. – unknown Jan 09 '14 at 11:59
  • genfromtxt returns an array, not a list. Arrays have a 'uniform' type. With dtype=None, you get back a structured array: the same type for each row, but with different types on each column. With dtype=str, you get an unstructured 2d array of type str. Unfortunately, it does not seem like you can produce a 2d array of type object where each entry has already been parsed. But that would be easy to do as a postprocessing step on the subarrays of interest. – Eelco Hoogendoorn Jan 09 '14 at 12:06
Found in another SO question:

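import csv

# newline='' is what the csv docs recommend; the old 'U' mode was removed in Python 3.11
with open('filename', newline='') as f:
    # each record from csv.reader is already a list of strings
    data = list(csv.reader(f, delimiter=','))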
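
The trouble with len() and x[][] mentioned in the comments on the question most likely comes from using the reader object directly: `csv.reader` returns an iterator, which supports neither len() nor indexing. Once it is wrapped in list(), the result behaves like any other list of lists. A small sketch, with `io.StringIO` standing in for the real file (an assumption made only to keep the example self-contained):

```python
import csv
import io

# Assumption: io.StringIO replaces open('filename', newline='') for the demo.
sample = io.StringIO(
    "b,30.83,0,u,g,w,v,1.25,t,t,01,f,g,00202,0,+\n"
    "a,58.67,4.46,u,g,q,h,3.04,t,t,06,f,g,00043,560,+\n"
)

reader = csv.reader(sample, delimiter=',')  # an iterator: no len(), no indexing
rows = list(reader)                         # a plain list of lists of strings

print(len(rows))     # number of rows
print(len(rows[0]))  # number of fields in the first row
print(rows[1][2])    # third field of the second row
```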
Nils Werner
  • Thanks all for the quick help. This is what I was looking for. I have some other things to fix, but this works like a charm. Thanks, Nils. – unknown Jan 09 '14 at 12:10