If you just iterate on the 'rows' of `data`:
In [321]: [one_hot(i, 7) for i in data]
Out[321]:
[array([[0.],
        [1.],
        [0.],
        [0.],
        [0.],
        [0.],
        [0.]]),
 array([[0.],
        [0.],
        [1.],
        [0.],
        [0.],
        [0.],
        [0.]]),
 array([[0.],
        [0.],
        [0.],
        [1.],
        [0.],
        [0.],
        [0.]]),
 array([[0.],
        [0.],
        [0.],
        [0.],
        [1.],
        [0.],
        [0.]])]
Since you tried to initialize `new_data` to (7,5), I suspect you want something more like:
In [322]: np.hstack(_)
Out[322]:
array([[0., 0., 0., 0.],
       [1., 0., 0., 0.],
       [0., 1., 0., 0.],
       [0., 0., 1., 0.],
       [0., 0., 0., 1.],
       [0., 0., 0., 0.],
       [0., 0., 0., 0.]])
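The iterate-then-`hstack` approach above can be sketched as a standalone script. The question's `one_hot` and `data` are not reproduced in this answer, so the versions below are assumptions, chosen only to be consistent with the output shown (1-based labels, each encoded as a (7,1) column).

```python
import numpy as np

# Hypothetical stand-in for the question's one_hot; the real definition
# is not shown here. Assumes 1-based labels and a (size, 1) column result.
def one_hot(val, size):
    vec = np.zeros((size, 1))
    vec[val - 1] = 1
    return vec

# Assumed (4,1) input; the actual values in the question may differ.
data = np.array([[2], [3], [4], [5]])

columns = [one_hot(i, 7) for i in data]  # list of four (7,1) arrays
result = np.hstack(columns)              # joined side by side: (7,4)
print(result.shape)
```

Each element of `columns` becomes one column of `result`, which is why the list of four (7,1) arrays ends up as a single (7,4) array.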
If you specify an `otypes`, you'd get:
In [326]: f = np.vectorize(one_hot,otypes=[object])
In [327]: f(data,7)
Out[327]:
array([[array([[0.],
       [1.],
       [0.],
       [0.],
       [0.],
       [0.],
       [0.]])],
       [array([[0.],
       [0.],
       [1.],
       [0.],
       [0.],
       [0.],
       [0.]])],
       [array([[0.],
       [0.],
       [0.],
       [1.],
       [0.],
       [0.],
       [0.]])],
       [array([[0.],
       [0.],
       [0.],
       [0.],
       [1.],
       [0.],
       [0.]])]], dtype=object)
That's a (4,1) array, corresponding to the (4,1) shape of your `data`. It could be turned into a (7,4) array with `np.hstack(_[:,0])`.
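As a rough sketch of the `otypes` route and the follow-up `hstack`, assuming the same hypothetical `one_hot` and `data` as the question might have used (neither is shown in this answer, so both are stand-ins):

```python
import numpy as np

# Hypothetical stand-in for the question's one_hot (1-based labels assumed).
def one_hot(val, size):
    vec = np.zeros((size, 1))
    vec[val - 1] = 1
    return vec

# Assumed (4,1) input; the actual values in the question may differ.
data = np.array([[2], [3], [4], [5]])

# otypes=[object] lets each call return a whole array as a single element.
f = np.vectorize(one_hot, otypes=[object])
res = f(data, 7)                      # object array, same shape as data

# Pull out the single column of (7,1) arrays and join them into (7,4).
stacked = np.hstack(list(res[:, 0]))
print(res.shape, res.dtype, stacked.shape)
```

The object array only acts as a container here; the numeric (7,4) result still has to be assembled with `hstack` afterwards.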
`vectorize` does not promise speed; with `signature`, as suggested in a comment, performance is even worse. As long as your `data` is (n,1), I don't see the point of using `vectorize`.
But why not populate a `new_data` array in one step?
In [337]: new_data = np.zeros((7,5),int)
In [338]: new_data[data[:,0]-1,np.arange(4)] =1
In [339]: new_data
Out[339]:
array([[0, 0, 0, 0, 0],
       [1, 0, 0, 0, 0],
       [0, 1, 0, 0, 0],
       [0, 0, 1, 0, 0],
       [0, 0, 0, 1, 0],
       [0, 0, 0, 0, 0],
       [0, 0, 0, 0, 0]])
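The one-step indexed assignment above could be wrapped in a small helper along these lines. The name `one_hot_columns` is hypothetical, and the sketch assumes 1-based integer labels; unlike the (7,5) example, it sizes the result to one column per label.

```python
import numpy as np

def one_hot_columns(labels, size):
    """Hypothetical helper: one one-hot column per label (1-based labels assumed)."""
    labels = np.asarray(labels).ravel()        # accept (n,1) or (n,) input
    out = np.zeros((size, labels.size), dtype=int)
    # Pair each row index (label - 1) with its column index in one assignment.
    out[labels - 1, np.arange(labels.size)] = 1
    return out

# Assumed (4,1) input; the actual values in the question may differ.
data = np.array([[2], [3], [4], [5]])
new_data = one_hot_columns(data, 7)
print(new_data.shape)
```

The two index arrays are matched element by element, so each label sets exactly one entry, with no Python-level loop and no call to `vectorize`.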