I want to write a function that randomly picks elements from a training set according to per-bin probabilities. I divide the set's indices into 11 bins and assign a custom probability to each bin.

import numpy as np

bin_probs = [0.5, 0.3, 0.15, 0.04, 0.0025, 0.0025, 0.001, 0.001, 0.001, 0.001, 0.001]

X_train = list(range(2000000))

train_probs = bin_probs * int(len(X_train) / len(bin_probs)) # extend probabilities across bin elements
train_probs.extend([0.001]*(len(X_train) - len(train_probs))) # a small fix to match number of elements
train_probs = train_probs/np.sum(train_probs) # normalize
indices = np.random.choice(range(len(X_train)), replace=False, size=50000, p=train_probs)
out_images = X_train[indices.astype(int)] # this is where I get the error

I get the following error:

TypeError: only integer scalar arrays can be converted to a scalar index with 1D numpy indices array

I find this weird, since I already checked the array of indices that I have created. It is 1-D, it is integer, and it is scalar.

What am I missing?

Note: I tried passing indices with astype(int) and got the same error.

Bedir Yilmaz

6 Answers

The error message is somewhat misleading, but the gist is that X_train is a list, not a NumPy array. A list cannot be indexed with an array of indices. Make it an array first:

out_images = np.array(X_train)[indices.astype(int)]
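
A minimal sketch of the failure and the fix on a small stand-in for X_train. The sizes, the uniform weights, and the names `items`, `probs`, and `picked` are illustrative assumptions, not taken from the question:

```python
import numpy as np

rng = np.random.default_rng(0)  # seeded generator, only to make the sketch repeatable

items = list(range(20))                      # small stand-in for X_train (a plain list)
probs = np.full(len(items), 1 / len(items))  # uniform weights, for illustration only

# Sample 5 distinct positions, analogous to the np.random.choice call in the question
picked = rng.choice(len(items), size=5, replace=False, p=probs)

try:
    items[picked]                # a list cannot be indexed with an index array
except TypeError as exc:
    print("list indexing failed:", exc)

out = np.asarray(items)[picked]  # an array accepts an array of indices
print(out.shape)
```

Note that the conversion copies the list's contents into a new array, which may matter when the list is very large.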
DYZ
  • I see. But, what if the list is too big to be converted to an array? The original X_train is a list of images in fact. – Bedir Yilmaz Jun 23 '18 at 04:42
  • Then try [shuffling the original list](https://stackoverflow.com/questions/976882/shuffling-a-list-of-objects). – DYZ Jun 23 '18 at 04:52
  • This saved me a lot of time troubleshooting! (It's a very misleading error message) – Steven Sagona Apr 08 '20 at 20:37
  • An alternative is to use a list comprehension if one insists that `X_train` and `out_images` have to remain lists. `out_images = [X_train[index] for index in indices]` – Nuclear03020704 Jun 10 '21 at 13:24
  • What if I really do have a list of lists, i.e. I cannot convert to an array (each list is of a different length) – seeker_after_truth Nov 27 '22 at 03:48
  • @seeker_after_truth use a list comprehension as mentioned in Nuclear03020704's comment. – Lu Kas Apr 14 '23 at 12:56

I get this error whenever I use np.concatenate the wrong way:

>>> a = np.eye(2)
>>> np.concatenate(a, a)
Traceback (most recent call last):
  File "<stdin>", line 1, in <module>
  File "<__array_function__ internals>", line 6, in concatenate
TypeError: only integer scalar arrays can be converted to a scalar index

The correct way is to input the two arrays as a tuple:

>>> np.concatenate((a, a))
array([[1., 0.],
       [0., 1.],
       [1., 0.],
       [0., 1.]])
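
The likely reason is that the second positional parameter of np.concatenate is axis, so in np.concatenate(a, a) the second array is read as the axis. A short sketch of the wrong call next to two working ones (the names `rows`, `cols`, and `raised` are illustrative):

```python
import numpy as np

a = np.eye(2)

try:
    np.concatenate(a, a)      # the second positional argument is treated as `axis`
    raised = False
except TypeError:
    raised = True

rows = np.concatenate((a, a))          # join along axis 0, the default
cols = np.concatenate((a, a), axis=1)  # pass the axis by keyword to join side by side
print(raised, rows.shape, cols.shape)
```

Any sequence of arrays works as the first argument, so np.concatenate([a, a]) is equivalent to the tuple form.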
Simon Alford
  • I fell into the same trap. It's very easy to overlook `np.concatenate()`'s requirement that the arrays to be concatenated must be supplied as a tuple. Many thanks! – András Aszódi Jul 21 '20 at 09:06

A simple case that generates this error message:

In [8]: [1,2,3,4,5][np.array([1])]
---------------------------------------------------------------------------
TypeError                                 Traceback (most recent call last)
<ipython-input-8-55def8e1923d> in <module>()
----> 1 [1,2,3,4,5][np.array([1])]

TypeError: only integer scalar arrays can be converted to a scalar index

Some variations that work:

In [9]: [1,2,3,4,5][np.array(1)]     # this is a 0d array index
Out[9]: 2
In [10]: [1,2,3,4,5][np.array([1]).item()]    
Out[10]: 2
In [11]: np.array([1,2,3,4,5])[np.array([1])]
Out[11]: array([2])

Basic Python list indexing is more restrictive than NumPy's:

In [12]: [1,2,3,4,5][[1]]
....
TypeError: list indices must be integers or slices, not list

edit

Looking again at

indices = np.random.choice(range(len(X_train)), replace=False, size=50000, p=train_probs)

indices is a 1-D array of integers, but it certainly isn't a scalar: it is an array of 50000 integers. Lists cannot be indexed with multiple indices at once, regardless of whether those indices are in a list or an array.
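
If the data has to stay a list, for example because its items have different lengths, the elements can be picked one index at a time. A small sketch under that assumption; `data` and its contents are hypothetical:

```python
import operator

import numpy as np

data = ["a", "bb", "ccc", "dddd", "eeeee"]  # hypothetical list of unequal-length items
indices = np.array([3, 0, 4])

# Index one element at a time; int() turns each NumPy integer into a plain int
picked = [data[int(i)] for i in indices]

# operator.itemgetter performs the same lookups in a single call
# (with two or more indices it returns a tuple)
getter = operator.itemgetter(*indices.tolist())
picked_again = list(getter(data))

print(picked)  # ['dddd', 'a', 'eeeee']
```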

hpaulj

Another case that could cause this error is

>>> np.ndindex(np.random.rand(60,60))
TypeError: only integer scalar arrays can be converted to a scalar index

np.ndindex expects a shape (a tuple of ints), not an array, so passing the array's shape fixes it.

>>> np.ndindex(np.random.rand(60,60).shape)
<numpy.ndindex object at 0x000001B887A98880>
Qin Heyang

Check that you're passing the right arguments. Similar to Simon, I was passing two arrays to np.all when it only accepted one array, meaning that the second array was interpreted to be an axis.
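
A sketch of that mistake, assuming the intent was to check whether two arrays are equal element by element (the arrays `a` and `b` are made up for illustration):

```python
import numpy as np

a = np.array([1, 2, 3])
b = np.array([1, 2, 3])

try:
    np.all(a, b)         # `b` lands in the `axis` parameter
    raised = False
except TypeError:
    raised = True

same = np.all(a == b)    # compare first, then reduce the single boolean array
print(raised, bool(same))
```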

Pro Q

Try to use x_train.shape[] instead.

Amirgiano