This is the code that I run:

import numpy as np

labels = [0, 1, 1, 0, 2, 1, 1, 1, 0, 0]
labels_ = np.zeros((10, 3))
labels_

The above code gives the output:

array([[0., 0., 0.],
       [0., 0., 0.],
       [0., 0., 0.],
       [0., 0., 0.],
       [0., 0., 0.],
       [0., 0., 0.],
       [0., 0., 0.],
       [0., 0., 0.],
       [0., 0., 0.],
       [0., 0., 0.]])

Now, when I run the code below ("block 2"):

labels_[np.arange(10), labels] = 1
labels_

it gives the output:

array([[1., 0., 0.],
       [0., 1., 0.],
       [0., 1., 0.],
       [1., 0., 0.],
       [0., 0., 1.],
       [0., 1., 0.],
       [0., 1., 0.],
       [0., 1., 0.],
       [1., 0., 0.],
       [1., 0., 0.]])

Can someone explain what happens in "block 2"?

1 Answer

When you use square brackets to index a 2-D NumPy array, the first number in the brackets refers to the row and the second number refers to the column, like a game of Battleships.
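As a small illustration of the [row, column] convention, here is a sketch using a made-up 3x3 array (the name grid and its contents are assumptions for the example, not part of the question):

```python
import numpy as np

# Hypothetical 3x3 array used only to illustrate [row, column] indexing.
grid = np.arange(9).reshape(3, 3)
# grid is:
# [[0 1 2]
#  [3 4 5]
#  [6 7 8]]

top_left = grid[0, 0]      # row 0, column 0 -> 0
middle_right = grid[1, 2]  # row 1, column 2 -> 5

print(top_left, middle_right)
```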

Now, you're indexing an array of zeros called labels_ using your list called labels, and an array created by np.arange:

labels_ = array([[0., 0., 0.],
                 [0., 0., 0.],
                 [0., 0., 0.],
                 [0., 0., 0.],
                 [0., 0., 0.],
                 [0., 0., 0.],
                 [0., 0., 0.],
                 [0., 0., 0.],
                 [0., 0., 0.],
                 [0., 0., 0.]])

labels = [0,1,1,0,2,1,1,1,0,0]

np.arange(10) = array([0,1,2,3,4,5,6,7,8,9])

As we said before, NumPy indexing goes [row,column] and you're using the index [np.arange(10), labels]. When both indices are sequences of the same length, NumPy pairs them up element by element: it takes the first value in the np.arange(10) array and the first item in the labels list and uses them as the row index and column index into the zeros array, labels_, then does the same with the second values, and so on.
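To make the pairing concrete, this sketch lists the (row, column) positions that the index addresses. The names rows and pairs are introduced here only for illustration:

```python
import numpy as np

labels = [0, 1, 1, 0, 2, 1, 1, 1, 0, 0]
rows = np.arange(10)

# Each (row, column) pair that the index [rows, labels] addresses, in order.
pairs = [(int(r), int(c)) for r, c in zip(rows, labels)]

print(pairs[:3])  # [(0, 0), (1, 1), (2, 1)]
```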

We know that the first item in np.arange(10) is 0 and the first item in labels is also 0, so it looks in your labels_ array for [0,0]: the first row and the first column. Your = 1 tells it to set the element at that position to 1, so it does that.

Note that Python counts from 0, so the first row is row 0, the second row is row 1, the third row is row 2 etc. Also note that we count the first row as the top row, and the first column as the left column.

So now we have:

labels_ = array([[1., 0., 0.],
                 [0., 0., 0.],
                 [0., 0., 0.],
                 [0., 0., 0.],
                 [0., 0., 0.],
                 [0., 0., 0.],
                 [0., 0., 0.],
                 [0., 0., 0.],
                 [0., 0., 0.],
                 [0., 0., 0.]])

Next it looks at the second value in np.arange(10) and labels. We know that np.arange(10)[1] = 1 and labels[1] = 1, so it sets the element at row 1 (the second row), column 1 (the second column) to 1.

So now we have:

labels_ = array([[1., 0., 0.],
                 [0., 1., 0.],
                 [0., 0., 0.],
                 [0., 0., 0.],
                 [0., 0., 0.],
                 [0., 0., 0.],
                 [0., 0., 0.],
                 [0., 0., 0.],
                 [0., 0., 0.],
                 [0., 0., 0.]])

Now we look at the third item in np.arange(10) and labels and get an index of [2,1]. That's the third row and second column of the labels_ array, which we set to 1:

labels_ = array([[1., 0., 0.],
                 [0., 1., 0.],
                 [0., 1., 0.],
                 [0., 0., 0.],
                 [0., 0., 0.],
                 [0., 0., 0.],
                 [0., 0., 0.],
                 [0., 0., 0.],
                 [0., 0., 0.],
                 [0., 0., 0.]])

We keep going like this until we run out of numbers in the lists that we're using for indexing.
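The walkthrough above can be sketched as an explicit loop and compared with the one-line version from the question. The names fancy and looped are made up for this comparison:

```python
import numpy as np

labels = [0, 1, 1, 0, 2, 1, 1, 1, 0, 0]

# Vectorised version from the question.
fancy = np.zeros((10, 3))
fancy[np.arange(10), labels] = 1

# Explicit loop that follows the walkthrough one (row, column) pair at a time.
looped = np.zeros((10, 3))
for row, col in zip(range(10), labels):
    looped[row, col] = 1

print(np.array_equal(fancy, looped))  # True
```

Each row ends up with a single 1 in the column named by its label, which is often called one-hot encoding.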

Ari Cooper-Davis