
I implemented a deep learning neural network from scratch, without using any Python frameworks like TensorFlow or Keras.

The problem is that no matter what I change in my code (adjusting the learning rate, changing the number of layers or nodes, or switching activation functions from sigmoid to ReLU to leaky ReLU), the training loss always starts at 6.98 and converges to 3.24...

Why is that?

Please review my forward and back propagation code. Maybe there's something wrong in it that I couldn't identify.

My hidden layers use leaky ReLU and the final layer uses a sigmoid activation. I'm trying to classify the MNIST handwritten digits.

code:

# FORWARD PROPAGATION

# hidden layers: a[l] = leaky_relu(w[l] . a[l-1] + b[l])
for i in range(layers - 1):
    cache["a"+str(i+1)] = lrelu(np.dot(param["w"+str(i+1)], cache["a"+str(i)]) + param["b"+str(i+1)])

# output layer uses a sigmoid activation
cache["a"+str(layers)] = sigmoid(np.dot(param["w"+str(layers)], cache["a"+str(layers-1)]) + param["b"+str(layers)])

yn = cache["a"+str(layers)]   # network output
m = X.shape[1]                # number of training examples

# cross-entropy cost averaged over the batch
cost = -np.sum(y*np.log(yn) + (1-y)*np.log(1-yn))/m

if j % 10 == 0:               # j is the outer training-loop counter
    print(cost)
    costs.append(cost)
    

# BACKPROPAGATION

# gradient w.r.t. the output layer's pre-activation (sigmoid + cross-entropy)
grad = {"dz"+str(layers): yn - y}

for i in range(layers):
    # weight and bias gradients for layer (layers - i)
    grad["dw"+str(layers-i)] = np.dot(grad["dz"+str(layers-i)], cache["a"+str(layers-i-1)].T)/m
    grad["db"+str(layers-i)] = np.sum(grad["dz"+str(layers-i)], axis=1, keepdims=True)/m

    # propagate the error back through the leaky ReLU of the previous layer
    if i < layers - 1:
        grad["dz"+str(layers-i-1)] = np.dot(param["w"+str(layers-i)].T, grad["dz"+str(layers-i)]) * lreluDer(cache["a"+str(layers-i-1)])

# gradient-descent parameter update
for i in range(layers):
    param["w"+str(i+1)] = param["w"+str(i+1)] - alpha*grad["dw"+str(i+1)]
    param["b"+str(i+1)] = param["b"+str(i+1)] - alpha*grad["db"+str(i+1)]

1 Answer


The implementation seems okay. While you could converge to the same value with different models, learning rates, or hyperparameters, what's suspicious is having the same starting value every time, 6.98 in your case.

I suspect it has to do with your initialisation. If you set all your weights to zero initially, you never break symmetry: every node in a layer computes the same output and receives the same gradient, so they all stay identical no matter how long you train. That is explained here and here in adequate detail.
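For example, a small random (He-style) initialisation of param would break symmetry. This is only a sketch; the layer_sizes list (e.g. [784, 128, 64, 10] for MNIST) is an illustrative assumption, not taken from your code:

import numpy as np

def init_params(layer_sizes, seed=0):
    # He initialisation: random weights scaled by sqrt(2 / fan_in),
    # which suits (leaky) ReLU hidden layers; biases can start at zero
    rng = np.random.default_rng(seed)
    param = {}
    for l in range(1, len(layer_sizes)):
        param["w"+str(l)] = rng.standard_normal((layer_sizes[l], layer_sizes[l-1])) * np.sqrt(2.0/layer_sizes[l-1])
        param["b"+str(l)] = np.zeros((layer_sizes[l], 1))
    return param

# e.g. param = init_params([784, 128, 64, 10])

Scaling by sqrt(2 / fan_in) keeps the variance of the activations roughly constant across layers, so the random start is small but non-degenerate.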

Ayush