8

I loaded a .csv file in Python with numpy.genfromtxt. It returns a one-dimensional numpy.ndarray whose elements are numpy.void objects, each of which is really just an array of integers. I would like to convert these from type numpy.void to numpy.array. To clarify:

>>> print(train_data.shape)
(42000,)
>>> print(type(train_data[0]))
<class 'numpy.void'>
>>> print(train_data[0])
(9, 0, 0)

So the element (9, 0, 0), which has type numpy.void, should be a numpy.array.

How can I convert all values from train_data to be numpy arrays?

Efficiency is also somewhat important because I am working with a lot of data.

Some more code

>>> with open('filename.csv', 'rt') as raw_training_data:
...     train_data = numpy.genfromtxt(raw_training_data, delimiter=',', names=True, dtype=numpy.integer)
>>> print(train_data.dtype)
[('label', '<i4'), ('pixel0', '<i4'), ('pixel1', '<i4')]
>>> print(type(train_data))
<class 'numpy.ndarray'>
Tristan
  • You should show the `genfromtxt` call. What's `train_data.dtype`? My guess is that it is a structured array: 1d with multiple fields, which are accessed by field name. Whether it is easy to convert to a 2d numeric dtype will depend on the field dtypes. – hpaulj Oct 30 '18 at 16:18
  • @hpaulj I added the `train_data.dtype`. – Tristan Oct 30 '18 at 16:28
  • `train_data['label']` is the first field, etc. If you want a 2d array with 3 columns, try `skip_header=1` instead of `names=True`. Since the fields are all `i4` we could convert this after loading, but loading in the desired format will be simpler. – hpaulj Oct 30 '18 at 16:34
  • Does this answer your question? [How to slice a numpy.ndarray made up of numpy.void numbers?](https://stackoverflow.com/questions/44295375/how-to-slice-a-numpy-ndarray-made-up-of-numpy-void-numbers) – Behdad Abdollahi Moghadam Mar 02 '22 at 09:02
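
To illustrate the `skip_header=1` suggestion from the comments, here is a minimal sketch. The in-memory `csv_text` is a made-up stand-in for the real file (only the column names come from the question), so treat it as an assumption rather than the actual data.

```python
import io

import numpy as np

# Hypothetical stand-in for the CSV file: one header row, then integer rows.
csv_text = "label,pixel0,pixel1\n9,0,0\n3,1,2\n"

# skip_header=1 drops the header line instead of turning it into field names,
# so genfromtxt returns a plain 2-D integer array rather than a structured one.
train_data = np.genfromtxt(
    io.StringIO(csv_text), delimiter=",", skip_header=1, dtype=int
)

print(train_data.shape)     # (2, 3)
print(type(train_data[0]))  # <class 'numpy.ndarray'>
```

Loading this way should avoid the numpy.void rows entirely, at the cost of losing the column names.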

3 Answers

2

I know this is a late answer, but I found a solution to a similar problem thanks to this question. If you convert train_data to a list and then convert that list to a NumPy array, that will do the job:

print(np.array(train_data.tolist()).shape)
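
For anyone who wants to try this without the original file, here is a small self-contained sketch. The two-row `structured` array is invented for illustration and only mimics the dtype shown in the question.

```python
import numpy as np

# Small structured array standing in for train_data (values are made up).
dt = [("label", "<i4"), ("pixel0", "<i4"), ("pixel1", "<i4")]
structured = np.array([(9, 0, 0), (3, 1, 2)], dtype=dt)

# tolist() turns each record into a plain tuple of Python ints;
# np.array() then builds an ordinary 2-D integer array from those tuples.
plain = np.array(structured.tolist(), dtype=np.int32)

print(plain.shape)     # (2, 3)
print(type(plain[0]))  # <class 'numpy.ndarray'>
```

Note that this goes through Python objects and copies the data, so it may be slow on very large arrays.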
nickY
1

Use the numpy.asarray() function, which converts an input to an array:

array = numpy.asarray(train_data[0])
samlli
  • Sorry if I was unclear. I meant converting all the 'void' arrays inside the outer array, so that I end up with an array that contains only numpy arrays. – Tristan Oct 30 '18 at 15:01
0

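You can do it by calling view with the matching dtype and shape. The element type of the view has to have the same size as the fields, which are all '<i4' (int32) here, so in your case you can do

 train_data = train_data.view((np.int32, 3))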
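
Here is a hedged sketch of the view approach on a tiny made-up array. It assumes every field has the same dtype, as in the question, and takes the element type from the first field so that it matches '<i4' exactly; the names `structured`, `field_dtype`, and `n_fields` are my own.

```python
import numpy as np

# Small structured array standing in for train_data (values are made up).
dt = [("label", "<i4"), ("pixel0", "<i4"), ("pixel1", "<i4")]
structured = np.array([(9, 0, 0), (3, 1, 2)], dtype=dt)

# Assumption: all fields share one dtype, so the first field's dtype
# describes every column of the resulting 2-D array.
field_dtype = structured.dtype[0]
n_fields = len(structured.dtype.names)

# Reinterpret each 12-byte record as a row of three int32 values.
plain = structured.view((field_dtype, n_fields))

print(plain.shape)                           # (2, 3)
print(np.shares_memory(plain, structured))   # True
```

Because view reinterprets the existing buffer rather than copying it, this should be the cheaper option for large data. If the element size does not match the field size, the view can silently produce wrong values.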
SzymonO