
I wanted to predict heart disease using the backpropagation algorithm for neural networks. For this I used the UCI heart disease data set linked here: processed cleveland. To do this, I used the code found on the following blog: Build a flexible Neural Network with Backpropagation in Python and changed it a little according to my own dataset. My code is as follows:

import numpy as np
import csv

reader = csv.reader(open("cleveland_data.csv"), delimiter=",")
x = list(reader)
result = np.array(x).astype("float")

X = result[:, :13]
y0 = result[:, 13]
y1 = np.array([y0])
y = y1.T

# scale units
X = X / np.amax(X, axis=0)  # maximum of X array

class Neural_Network(object):
    def __init__(self):
        # parameters
        self.inputSize = 13
        self.outputSize = 1
        self.hiddenSize = 13

        # weights
        self.W1 = np.random.randn(self.inputSize, self.hiddenSize)  
        self.W2 = np.random.randn(self.hiddenSize, self.outputSize)  

    def forward(self, X):
        # forward propagation through our network
        self.z = np.dot(X, self.W1)  
        self.z2 = self.sigmoid(self.z)  # activation function
        self.z3 = np.dot(self.z2, self.W2)  
        o = self.sigmoid(self.z3)  # final activation function
        return o

    def sigmoid(self, s):
        # activation function
        return 1 / (1 + np.exp(-s))

    def sigmoidPrime(self, s):
        # derivative of sigmoid
        return s * (1 - s)

    def backward(self, X, y, o):
        # backward propgate through the network
        self.o_error = y - o  # error in output
        self.o_delta = self.o_error * self.sigmoidPrime(o)  # applying derivative of sigmoid to error

        self.z2_error = self.o_delta.dot(
            self.W2.T)  # z2 error: how much our hidden layer weights contributed to output error
        self.z2_delta = self.z2_error * self.sigmoidPrime(self.z2)  # applying derivative of sigmoid to z2 error

        self.W1 += X.T.dot(self.z2_delta)  # adjusting first set (input --> hidden) weights
        self.W2 += self.z2.T.dot(self.o_delta)  # adjusting second set (hidden --> output) weights

    def train(self, X, y):
        o = self.forward(X)
        self.backward(X, y, o)


NN = Neural_Network()
for i in range(100):  # trains the NN 100 times
    print("Input: \n" + str(X))
    print("Actual Output: \n" + str(y))
    print("Predicted Output: \n" + str(NN.forward(X)))
    print("Loss: \n" + str(np.mean(np.square(y - NN.forward(X)))))  # mean sum squared loss
    print("\n")
    NN.train(X, y)

But when I run this code, all my predicted outputs become 1 after a few iterations and then stay the same for the rest of the 100 iterations. What is the problem in the code?

  • See this lovely [debug](https://ericlippert.com/2014/03/05/how-to-debug-small-programs/) blog for help. Please explain more about where your program is deviating from the expected results. – Prune Apr 30 '18 at 17:33
  • I think something is wrong with my back-propagation logic, which makes the predicted outputs = 1 after a few iterations. – Tarun Khare Apr 30 '18 at 18:39

1 Answer


A few mistakes I've noticed:

  • The output of your network is a sigmoid, i.e. a value in [0, 1] -- suitable for predicting probabilities. But the target seems to be a value in [0, 4]. This explains the network's tendency to push its output as high as possible to get close to the large labels, but it can't go above 1.0 and gets stuck there.

    You should either get rid of the final sigmoid or pre-process the labels and scale them to [0, 1]. Either option will make it learn better (see the sketch after this list).

  • You don't use a learning rate (effectively setting it to 1.0), which is probably too high, so it's possible for the NN to diverge. My experiments showed that 0.01 is a good learning rate, but you can play around with that.
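
If it helps, here is a minimal sketch of both changes applied to your class (the division by 4.0 assumes the Cleveland target column only takes the integer values 0-4, and learning_rate=0.01 is just the value that worked in my experiments):

import numpy as np

# scale the labels from [0, 4] to [0, 1] so the sigmoid output can actually reach them
y = y1.T / 4.0

class Neural_Network(object):
    def __init__(self, learning_rate=0.01):
        self.inputSize = 13
        self.outputSize = 1
        self.hiddenSize = 13
        self.lr = learning_rate  # new: step size for the weight updates

        self.W1 = np.random.randn(self.inputSize, self.hiddenSize)
        self.W2 = np.random.randn(self.hiddenSize, self.outputSize)

    def forward(self, X):
        self.z = np.dot(X, self.W1)
        self.z2 = self.sigmoid(self.z)
        self.z3 = np.dot(self.z2, self.W2)
        return self.sigmoid(self.z3)

    def sigmoid(self, s):
        return 1 / (1 + np.exp(-s))

    def sigmoidPrime(self, s):
        return s * (1 - s)

    def backward(self, X, y, o):
        self.o_error = y - o
        self.o_delta = self.o_error * self.sigmoidPrime(o)
        self.z2_error = self.o_delta.dot(self.W2.T)
        self.z2_delta = self.z2_error * self.sigmoidPrime(self.z2)
        # scale the updates by the learning rate instead of applying them as-is
        self.W1 += self.lr * X.T.dot(self.z2_delta)
        self.W2 += self.lr * self.z2.T.dot(self.o_delta)

    def train(self, X, y):
        o = self.forward(X)
        self.backward(X, y, o)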

Other than this, your backprop seems to be working correctly.
