
I'm trying to reproduce a neural network from http://neuralnetworksanddeeplearning.com/chap2.html

What I don't get is why they can compute the gradient of the cost with respect to the weights by taking a dot product of the error/delta and the transposed activations of the previous layer.

nabla_w[-1] = np.dot(delta, activations[-2].transpose())

delta is a 1-dimensional array, and so is activations[-2]. I thought that transposing a 1-dimensional array just gives you back a 1-dimensional array, so this dot product should give only a single number and not the matrix we want.

So how can this dot product give me a 2-dimensional matrix?

And is there a smart way to compute this weight gradient with NumPy?
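For reference, a minimal sketch of the shape behaviour in question, assuming genuinely 1-D arrays:

```python
import numpy as np

delta = np.array([1.0, 2.0, 3.0])   # 1-D, shape (3,)
a_prev = np.array([4.0, 5.0, 6.0])  # 1-D, shape (3,)

# Transposing a 1-D array is a no-op: the shape stays (3,).
print(a_prev.transpose().shape)

# So the dot product of two 1-D arrays is a single scalar, not a matrix.
print(np.dot(delta, a_prev.transpose()))  # 1*4 + 2*5 + 3*6 = 32.0
```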

  • Welcome to Stack Overflow! Interesting question you got there, but maybe it is a better fit on https://datascience.stackexchange.com ? – totokaka Dec 29 '18 at 15:05

2 Answers


Calculating the dot product between two vectors, i.e. your one-dimensional arrays, returns a single scalar value. (A cross product between two 3-component vectors, by contrast, produces a new vector.)

Therefore it can't result in a matrix: the dot product of vectors only produces a scalar. np.dot() called with two 2-D arrays returns the matrix product of those arrays, but that is not the same operation as the dot product of vectors.
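To illustrate the distinction, a small sketch showing the same np.dot call behaving differently for 1-D inputs versus 2-D inputs:

```python
import numpy as np

u = np.array([1.0, 2.0])
v = np.array([3.0, 4.0])

# 1-D inputs: np.dot is the inner product, a scalar.
s = np.dot(u, v)  # 1*3 + 2*4 = 11.0
print(s)

# 2-D inputs: np.dot is the matrix product.
# A (2, 1) column times a (1, 2) row gives a (2, 2) matrix.
A = u.reshape(2, 1)
B = v.reshape(1, 2)
M = np.dot(A, B)
print(M)  # [[3. 4.]
          #  [6. 8.]]
```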

– hatati

np.dot computes the dot (inner) product of two vectors if both a and b are 1-D arrays.

For 2-D arrays, it instead returns the matrix product of the two arrays. Do not confuse this with the dot product, which is defined on vectors, not matrices.
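This is the key to the question: in the book's code the activations and deltas are stored as (n, 1) column arrays rather than 1-D arrays, so np.dot performs a matrix product and the result is the full weight-gradient matrix. A sketch, with np.outer shown as the equivalent for genuinely 1-D arrays:

```python
import numpy as np

# Column vectors of shape (n, 1), as used in the book's network.py.
delta = np.array([[1.0], [2.0]])           # shape (2, 1)
a_prev = np.array([[3.0], [4.0], [5.0]])   # shape (3, 1)

# (2, 1) matrix-multiplied by (1, 3) gives the (2, 3) gradient matrix.
nabla_w = np.dot(delta, a_prev.transpose())
print(nabla_w.shape)  # (2, 3)

# With 1-D arrays, np.outer produces the same matrix directly.
same = np.outer(delta.ravel(), a_prev.ravel())
print(np.array_equal(nabla_w, same))  # True
```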