I am trying to read a Unicode data file into a few lists. I have a tab-delimited file of mixed Unicode/integer/float data in this format (the first field of the third row is a single space character):
Է 1335 1.1
դ 1380 1.2
32 1.3
ն 1398 1.4
ե 1381 1.5
ր 1408 1.6
I am reading the file with numpy.genfromtxt, following the approach from a related question about numpy.genfromtxt:
import numpy as np

# Decode the raw bytes of the first column as UTF-8.
decodef = lambda x: x.decode("utf-8")
arr = np.genfromtxt("./data_files/data", delimiter="\t", dtype="U1, i4, f8", converters={0: decodef})
This gives me a numpy.ndarray in which the space in the first column comes back as an empty string instead of ' ':
('Է', 1335, 1.1)
('դ', 1380, 1.2)
('', 32, 1.3)
('ն', 1398, 1.4)
('ե', 1381, 1.5)
('ր', 1408, 1.6)
I have already tried to solve the space issue with the parameters autostrip=False (the default value), missing_values=" " and replace_space='_', but I still get the same array with empty items in place of the spaces. I guess all these parameters are intended only for delimiter manipulation?!
Any ideas how to overcome this?
I am using Python 3.4.5.
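For comparison, here is a minimal sketch of the result I am after, parsing the data by hand with plain Python instead of genfromtxt. It assumes the file is tab-delimited UTF-8 with exactly three columns per row; read_columns is a hypothetical helper name, and part of the sample is inlined through io.StringIO so the snippet runs on its own.

```python
import io

# Inline copy of part of the sample data (assumed tab-delimited).
SAMPLE = "Է\t1335\t1.1\nդ\t1380\t1.2\n \t32\t1.3\nն\t1398\t1.4\n"


def read_columns(handle, delimiter="\t"):
    """Hypothetical helper: split each line by hand, keeping whitespace-only fields."""
    chars, codes, values = [], [], []
    for line in handle:
        # Remove only the line terminator, so a leading space survives.
        line = line.rstrip("\r\n")
        if not line:
            continue  # skip blank lines
        char, code, value = line.split(delimiter)
        chars.append(char)
        codes.append(int(code))
        values.append(float(value))
    return chars, codes, values


chars, codes, values = read_columns(io.StringIO(SAMPLE))
print(chars)
print(codes)
print(values)
```

With the real file, the handle would be open("./data_files/data", encoding="utf-8") instead of the io.StringIO object. This keeps the space, but I would still prefer a genfromtxt-based solution if one exists.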