
I have programmed a simple NN library which creates a neural network of any chosen size and can train it with a given activation function and its derivative. The networks do a very good job with sigmoid as the activation function but never converge with ReLU or leaky ReLU. As a standard problem for easily detecting whether a network is converging, I use the task of separating two groups of points on a 2D grid, where a point belongs to the positive class if (x>0 and y>0) or (x<0 and y<0). This is the result of using sigmoid to solve the problem:

[image: network output with sigmoid]

And this is the result of using leaky ReLU on the same problem:

[image: network output with leaky ReLU]
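
For context, the test data for that problem could be generated with something like the sketch below; the sample count, seed, and function name are illustrative assumptions, not taken from test_problem.py.

import numpy as np

# Hypothetical generator for the quadrant-separation problem:
# label 1 if (x > 0 and y > 0) or (x < 0 and y < 0), else 0.
def make_quadrant_data(n_points=1000, seed=0):
    rng = np.random.default_rng(seed)
    points = rng.uniform(-1.0, 1.0, size=(n_points, 2))       # random (x, y) in [-1, 1]^2
    labels = (points[:, 0] * points[:, 1] > 0).astype(float)  # same-sign coordinates -> class 1
    return points, labels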

I have implemented the predict and back-propagation functions myself, so I thought the problem might be there, but the fact that the network does well with sigmoid made me think that ReLU and leaky ReLU are the problem. I have also implemented those myself, but I can't see the problem in my implementation. Here is my ReLU implementation and its derivative:

import numpy as np

## ReLU
def Relu(num):
    # element-wise: num where positive, 0 otherwise
    return np.where(num > 0, num, 0)


## ReLU derivative
def Relu_deriv(num):
    # 1 where num > 0, 0 elsewhere (including at exactly 0)
    num = np.where(num > 0, num, 0)
    num = np.where(num == 0, num, 1)

    return num

Here is my leaky ReLU implementation and its derivative:

## leaky ReLU
def L_Relu(num):
    # element-wise: num where positive, 0.01*num otherwise
    return np.where(num > 0, num, 0.01 * num)

## leaky ReLU derivative
def L_Relu_D(num):
    # 1 for positive values, 0.01 for negative values (and 0 at exactly 0)
    num = np.where(num <= 0, num, 1)
    num = np.where(num >= 0, num, 0.01)

    return num
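
As a quick sanity check, the derivatives themselves can be compared against central finite differences at points away from the kink at 0 (a generic diagnostic sketch, not part of the library):

import numpy as np

def finite_diff_check(f, f_deriv, x, eps=1e-6):
    # Central finite difference of f, compared to the analytic derivative.
    numeric = (f(x + eps) - f(x - eps)) / (2 * eps)
    return np.max(np.abs(numeric - f_deriv(x)))

x = np.array([-2.0, -0.5, 0.3, 1.7])              # test points away from 0
print(finite_diff_check(Relu, Relu_deriv, x))     # expected ~0
print(finite_diff_check(L_Relu, L_Relu_D, x))     # expected ~0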

I have also uploaded the code of the whole "library" to GitHub: Click to see the code on GitHub

You can see the code of the predict and back-propagation functions in main.py. If you want to test the networks on the problem of separating two groups of points, you can run the test functions in test_problem.py.

I would really appreciate any advice about the reason behind the problem and how to solve it. Note: I thought the problem might be dying ReLU, but that should not happen with leaky ReLU.
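
For completeness, dying ReLU can still be checked for directly by counting hidden units that never activate on a batch; hidden_out below is an assumed (samples, units) array of post-activation values, not a variable from the library.

import numpy as np

# hidden_out: hypothetical (n_samples, n_units) array of one hidden
# layer's post-activation outputs collected over a batch.
hidden_out = np.maximum(0, np.random.randn(128, 16))   # placeholder data
dead_units = np.all(hidden_out <= 0, axis=0)           # units that are zero for every sample
print("dead units:", dead_units.sum(), "of", hidden_out.shape[1])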

  • Please do not format Python code as JavaScript snippets (edited). – desertnaut Nov 13 '20 at 20:14
  • Hi, what type do the entries of your `num` array have? Btw, I think you can reduce the derivative functions so they only use one `where`. I would change the derivative function of the LeakyRelu to: `num=np.where(num<=0, 0.01, 1.0)`. This version delivers a nonzero value for every point. Your version returns 0 at point 0. But I think this is not the reason for the behaviour you observed. Have you already checked your weight initialization to exclude it as the reason? – jottbe Nov 13 '20 at 22:18
  • I have tried to add the code in normal code snippets, but I had problems in my browser. Sorry about that. – Ashraf Beshtawi Nov 14 '20 at 13:13
  • Thanks for the tip, jottbe :) – Ashraf Beshtawi Nov 14 '20 at 13:14

0 Answers