
When computing the delta values for a neural network after running backpropagation:

[image: the delta computation step from the backpropagation algorithm]

the value of delta(1) comes out as a scalar value. Shouldn't it be a vector?

Update:

Taken from http://www.holehouse.org/mlclass/09_Neural_Networks_Learning.html

Specifically: [image: the accumulation step Delta := Delta + delta * a^T from those notes]

blue-sky

1 Answer


First, you probably understand that in each layer we have n x m parameters (or weights) that need to be learned, so they form a 2-D matrix.

n is the number of nodes in the current layer.
m is the number of nodes in the previous layer plus 1 (for the bias unit).

We have n x m parameters because there is one connection between every node in the previous layer and every node in the current layer.
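
For example (the sizes here are illustrative, not from the question): if the previous layer has 3 nodes and the current layer has 4, the weight matrix for the current layer is 4 x (3 + 1) = 4 x 4, counting the bias unit.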

I am pretty sure that Delta (big delta) at layer L is used to accumulate the partial-derivative terms for every parameter at layer L, so you have a 2-D matrix of Delta at each layer as well. To update the entry in the i-th row (the i-th node in the current layer) and j-th column (the j-th node in the previous layer) of the matrix,

D_(i,j) := D_(i,j) + a_j * delta_i
note: a_j is the activation of the j-th node in the previous layer,
      delta_i is the error of the i-th node in the current layer,
so each entry accumulates the error in proportion to the activation that feeds into it.

Thus to answer your question, Delta should be a matrix.
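
A minimal NumPy sketch of this accumulation for a single training example (the sizes and variable names below are illustrative assumptions, not from the original post):

    import numpy as np

    # Illustrative sizes: previous layer has 3 nodes, current layer has 4.
    m = 3 + 1          # previous-layer nodes plus the bias unit
    n = 4              # current-layer nodes

    a = np.random.rand(m)       # activations of the previous layer (incl. bias), shape (m,)
    delta = np.random.rand(n)   # errors of the current layer, shape (n,)

    Delta = np.zeros((n, m))    # big-Delta accumulator, one entry per weight

    # The outer product has entry (i, j) equal to delta_i * a_j,
    # so this is exactly D_(i,j) += a_j * delta_i for all i, j at once.
    Delta += np.outer(delta, a)

    print(Delta.shape)  # (4, 4) -- a matrix, not a scalar

The outer product delta * a^T fills the whole matrix in one step, which is why the accumulated result is a matrix rather than a scalar.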

greeness
  • thanks, but my question is why a scalar is being output instead of a matrix, since error * (a)transpose comes out as a scalar. Perhaps the link I pointed to is incorrect? – blue-sky May 12 '16 at 21:00
  • error is n x 1 and the transpose of a is 1 x m, so the product is n x m. You probably calculated (1 x n) x (n x 1), which gives a scalar. – greeness May 12 '16 at 21:02
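
A two-line check of that shape argument (a sketch; the vectors are illustrative, and the same length only so that both products are defined):

    import numpy as np

    delta = np.random.rand(4)   # error vector, n x 1 with n = 4
    a = np.random.rand(4)       # activation vector, same length here

    print(np.outer(delta, a).shape)  # (4, 4): (n x 1) * (1 x m) gives an n x m matrix
    print(np.dot(delta, a))          # one number: (1 x n) * (n x 1) collapses to a scalar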