
I'm trying to reproduce a neural network from http://neuralnetworksanddeeplearning.com/chap2.html

What I don't get is why they can compute the gradient of the cost with respect to the weights by taking a dot product of the error/delta and the transposed activations of the previous layer.

nabla_w[-1] = np.dot(delta, activations[-2].transpose())

delta is a 1-dimensional array, and so is activations[-2]. I thought that transposing a 1-dimensional array just gives you back a 1-dimensional array, so this dot product should give only a single number and not the matrix we want.

So how can this dot product give me a 2-dimensional matrix?

And is there a smart way to compute this weight gradient with NumPy?
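For reference, a minimal sketch of the shape behaviour in question, assuming genuinely 1-D arrays:

```python
import numpy as np

delta = np.array([1.0, 2.0, 3.0])   # 1-D, shape (3,)
a_prev = np.array([4.0, 5.0, 6.0])  # 1-D, shape (3,)

# Transposing a 1-D array is a no-op: the shape stays (3,).
print(a_prev.transpose().shape)

# So the dot product of two 1-D arrays is a single scalar, not a matrix.
print(np.dot(delta, a_prev.transpose()))  # 1*4 + 2*5 + 3*6 = 32.0
```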

  • Welcome to Stack Overflow! Interesting question you got there, but maybe it is a better fit on https://datascience.stackexchange.com ? – totokaka Dec 29 '18 at 15:05

2 Answers


Calculating the dot product between two vectors, i.e. your one-dimensional arrays, returns a single scalar value. (A cross product between two 3-component vectors, by contrast, produces a new vector.)

Therefore it can't result in a matrix: the dot product of vectors only produces a scalar. np.dot() called with two 2-D arrays returns the matrix product of those arrays, but that is not the same operation as the dot product of vectors.
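To illustrate the distinction, a small sketch showing the same np.dot call behaving differently for 1-D inputs versus 2-D inputs:

```python
import numpy as np

u = np.array([1.0, 2.0])
v = np.array([3.0, 4.0])

# 1-D inputs: np.dot is the inner product, a scalar.
s = np.dot(u, v)  # 1*3 + 2*4 = 11.0
print(s)

# 2-D inputs: np.dot is the matrix product.
# A (2, 1) column times a (1, 2) row gives a (2, 2) matrix.
A = u.reshape(2, 1)
B = v.reshape(1, 2)
M = np.dot(A, B)
print(M)  # [[3. 4.]
          #  [6. 8.]]
```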

– hatati

np.dot computes the dot (inner) product of two vectors if both a and b are 1-D arrays.

For 2-D arrays, it instead returns the matrix product of the two arrays. Do not confuse this with the dot product, which is defined on vectors, not matrices.
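This is the key to the question: in the book's code the activations and deltas are stored as (n, 1) column arrays rather than 1-D arrays, so np.dot performs a matrix product and the result is the full weight-gradient matrix. A sketch, with np.outer shown as the equivalent for genuinely 1-D arrays:

```python
import numpy as np

# Column vectors of shape (n, 1), as used in the book's network.py.
delta = np.array([[1.0], [2.0]])           # shape (2, 1)
a_prev = np.array([[3.0], [4.0], [5.0]])   # shape (3, 1)

# (2, 1) matrix-multiplied by (1, 3) gives the (2, 3) gradient matrix.
nabla_w = np.dot(delta, a_prev.transpose())
print(nabla_w.shape)  # (2, 3)

# With 1-D arrays, np.outer produces the same matrix directly.
same = np.outer(delta.ravel(), a_prev.ravel())
print(np.array_equal(nabla_w, same))  # True
```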