Use np.genfromtxt to read data of different dtypes in csv file

Question

I am trying to read a csv file that looks like:

label,value
first,1.234e-01
second,5.678e-02
three,9.876e-03
...

etc

The first column contains strings and the second column contains floats.

From the online documentation of np.genfromtxt I thought that the line

file_data = np.genfromtxt(filepath, dtype=[('label','<U'),('value','<f4')], delimiter=',', skip_header=1)

would specify the dtype of each column so that each is read appropriately, but when I print file_data I get something that looks like

[('', 1.234e-01) ('', 5.678e-02) ('', 9.876e-03) ...]

when I was expecting

[('first', 1.234e-01) ('second', 5.678e-02) ('three', 9.876e-03) ...]


score 1 · Accepted Answer · answered Feb 20 '23 at 14:28

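You need to give the string field an explicit width in the dtype (for example <U10). A bare <U is a zero-length string type, so every label is truncated to an empty string. Choose a width at least as long as the longest label you expect, since longer values are cut off: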

from io import StringIO

import numpy as np

data = '''label,value
first,1.234e-01
second,5.678e-02
three,9.876e-03'''

file_data = np.genfromtxt(StringIO(data), dtype=[('label','<U15'),('value','<f4')], delimiter=',', skip_header=1)
print(file_data)

[('first', 0.1234  ) ('second', 0.05678 ) ('three', 0.009876)]
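
If you would rather not pick a width by hand, one possible alternative is to let genfromtxt infer the column types. The sketch below is not part of the original answer. It assumes NumPy 1.14 or later (for the encoding argument) and reuses the sample data from above; the variable name inferred is only illustrative. It first checks why the bare <U failed, then reads the same data with dtype=None and names=True so the field names come from the header row:

```python
from io import StringIO

import numpy as np

data = """label,value
first,1.234e-01
second,5.678e-02
three,9.876e-03"""

# A bare '<U' has zero width, which is why the labels came back empty.
print(np.dtype("<U").itemsize)    # 0
print(np.dtype("<U15").itemsize)  # 60 (15 characters * 4 bytes each)

# Alternative (an assumption, not the answer's method): dtype=None asks
# genfromtxt to infer each column's type, and names=True takes the field
# names from the header row instead of skipping it.
inferred = np.genfromtxt(
    StringIO(data),
    dtype=None,
    delimiter=",",
    names=True,
    encoding="utf-8",
)

print(inferred.dtype.names)
print(inferred["label"])
print(inferred["value"])
```

With inference the float column will typically come back as 64-bit rather than <f4, so pass an explicit dtype as in the answer if you need a specific precision.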
