How to increase accuracy of network running on MNIST

Question

I followed this code: https://github.com/HyTruongSon/Neural-Network-MNIST-CPP

It is quite easy to understand and reaches 94% accuracy. I have to convert it to a deeper network (5 to 10 layers). To get comfortable, I started by adding just one more layer. However, no matter how long I train it, accuracy doesn't go beyond 50%. Each hidden layer has 256 neurons. Here is how I modified the code. I added the extra layer like this:

// From layer 1 to layer 2. Or: Input layer - Hidden layer 1
double *w1[n1 + 1], *delta1[n1 + 1], *out1;

// From layer 2 to layer 3. Or: Hidden layer 1 - Hidden layer 2
double *w2[n2 + 1], *delta2[n2 + 1], *in2, *out2, *theta2;

// From layer 3 to layer 4. Or: Hidden layer 2 - Output layer
double *w3[n3 + 1], *delta3[n3 + 1], *in3, *out3, *theta3;

// Layer 4 - Output layer
double *in4, *out4, *theta4;
double expected[n4 + 1];

The feedforward part is modified this way:

void perceptron() {
    for (int i = 1; i <= n2; ++i) {
        in2[i] = 0.0;
    }

    for (int i = 1; i <= n3; ++i) {
        in3[i] = 0.0;
    }
    for (int i = 1; i <= n4; ++i) {
        in4[i] = 0.0;
    }

    for (int i = 1; i <= n1; ++i) {
        for (int j = 1; j <= n2; ++j) {
            in2[j] += out1[i] * w1[i][j];
        }
    }

    for (int i = 1; i <= n2; ++i) {
        out2[i] = sigmoid(in2[i]);
    }

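    // Hidden layer 1 -> hidden layer 2
    for (int i = 1; i <= n2; ++i) {
        for (int j = 1; j <= n3; ++j) {
            in3[j] += out2[i] * w2[i][j];
        }
    }

    // Apply the activation to all n3 units of hidden layer 2
    for (int i = 1; i <= n3; ++i) {
        out3[i] = sigmoid(in3[i]);
    }

    // Hidden layer 2 -> output layer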
    for (int i = 1; i <= n3; ++i) {
        for (int j = 1; j <= n4; ++j) {
            in4[j] += out3[i] * w3[i][j];
        }
    }

    for (int i = 1; i <= n4; ++i) {
        out4[i] = sigmoid(in4[i]);
    }
}

Backpropagation is changed this way:

void back_propagation() {
    double sum;

    for (int i = 1; i <= n4; ++i) {
        theta4[i] = out4[i] * (1 - out4[i]) * (expected[i] - out4[i]);
    }

    for (int i = 1; i <= n3; ++i) {
        sum = 0.0;
        for (int j = 1; j <= n4; ++j) {
            sum += w3[i][j] * theta4[j];
        }
        theta3[i] = out3[i] * (1 - out3[i]) * sum;
    }

    for (int i = 1; i <= n3; ++i) {
        for (int j = 1; j <= n4; ++j) {
            delta3[i][j] = (learning_rate * theta4[j] * out3[i]) + (momentum * delta3[i][j]);
            w3[i][j] += delta3[i][j];
        }
    }

    // Update the weights between hidden layer 1 and hidden layer 2
    for (int i = 1; i <= n2; ++i) {
        for (int j = 1; j <= n3; ++j) {
            delta2[i][j] = (learning_rate * theta3[j] * out2[i]) + (momentum * delta2[i][j]);
            w2[i][j] += delta2[i][j];
        }
    }

    for (int i = 1; i <= n1; ++i) {
        for (int j = 1; j <= n2; ++j) {
            delta1[i][j] = (learning_rate * theta2[j] * out1[i]) + (momentum * delta1[i][j]);
            w1[i][j] += delta1[i][j];
        }
    }
}

I am posting my modifications as well because I might have made a mistake somewhere in them. I once set the epochs variable to 1000 and let it train for 24 hours, with still no progress :(. I am quite frustrated and don't know where I went wrong.

score 0 · Answer 1 · answered Oct 31 '18 at 15:55

Did you forget to backpropagate the error into theta2, from layer 3 to layer 2? Your w1 update uses theta2, but back_propagation() never computes it.

for (int i = 1; i <= n2; ++i) {
    sum = 0.0;
    for (int j = 1; j <= n3; ++j) {
        sum += w2[i][j] * theta3[j];
    }
    theta2[i] = out2[i] * (1 - out2[i]) * sum;
}
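
One way to check a fix like this is a finite-difference gradient test. The following is a minimal sketch, not the asker's program: it assumes a tiny 3-4-4-2 network without biases, a squared-error loss of 0.5 * sum((expected - out)^2), and 0-indexed std::vector storage instead of the raw arrays in the question. The names Vec, Mat, forward, hidden_theta, loss and max_gradient_error_w1 are made up for illustration. Note that each theta is computed from the weights the forward pass used, so in the question's back_propagation() the theta2 loop would go before the w2 update, the same way theta3 is computed before w3 is updated.

```cpp
#include <algorithm>
#include <cassert>
#include <cmath>
#include <cstddef>
#include <vector>

using Vec = std::vector<double>;
using Mat = std::vector<Vec>;  // w[i][j]: weight from unit i to unit j of the next layer

double sigmoid(double x) { return 1.0 / (1.0 + std::exp(-x)); }

// Forward pass through one fully connected sigmoid layer (no bias, as in the question).
Vec forward(const Vec& in, const Mat& w) {
    assert(!w.empty() && in.size() == w.size());
    Vec out(w[0].size(), 0.0);
    for (std::size_t i = 0; i < in.size(); ++i)
        for (std::size_t j = 0; j < out.size(); ++j)
            out[j] += in[i] * w[i][j];
    for (double& v : out) v = sigmoid(v);
    return out;
}

// Error term of a hidden layer, computed from the next layer's error term.
Vec hidden_theta(const Vec& out, const Mat& w_next, const Vec& theta_next) {
    Vec theta(out.size());
    for (std::size_t i = 0; i < out.size(); ++i) {
        double sum = 0.0;
        for (std::size_t j = 0; j < theta_next.size(); ++j)
            sum += w_next[i][j] * theta_next[j];
        theta[i] = out[i] * (1.0 - out[i]) * sum;
    }
    return theta;
}

// Assumed loss: 0.5 * sum((expected - out4)^2).
double loss(const Vec& out1, const Mat& w1, const Mat& w2, const Mat& w3,
            const Vec& expected) {
    const Vec out4 = forward(forward(forward(out1, w1), w2), w3);
    double e = 0.0;
    for (std::size_t k = 0; k < out4.size(); ++k)
        e += 0.5 * (expected[k] - out4[k]) * (expected[k] - out4[k]);
    return e;
}

// Largest absolute difference between theta2[j] * out1[i] and a central
// finite-difference estimate of -dLoss/dw1[i][j].
double max_gradient_error_w1() {
    const Vec out1 = {0.2, -0.4, 0.7};
    const Vec expected = {1.0, 0.0};
    Mat w1(3, Vec(4)), w2(4, Vec(4)), w3(4, Vec(2));
    double s = 0.0;  // deterministic, arbitrary weights
    for (Mat* w : {&w1, &w2, &w3})
        for (Vec& row : *w)
            for (double& v : row) { s += 1.0; v = 0.5 * std::sin(s); }

    const Vec out2 = forward(out1, w1);
    const Vec out3 = forward(out2, w2);
    const Vec out4 = forward(out3, w3);

    Vec theta4(out4.size());
    for (std::size_t k = 0; k < out4.size(); ++k)
        theta4[k] = out4[k] * (1.0 - out4[k]) * (expected[k] - out4[k]);
    const Vec theta3 = hidden_theta(out3, w3, theta4);
    const Vec theta2 = hidden_theta(out2, w2, theta3);  // the step missing in the question

    const double eps = 1e-6;
    double worst = 0.0;
    for (std::size_t i = 0; i < w1.size(); ++i) {
        for (std::size_t j = 0; j < w1[i].size(); ++j) {
            Mat wp = w1, wm = w1;
            wp[i][j] += eps;
            wm[i][j] -= eps;
            const double numeric =
                -(loss(out1, wp, w2, w3, expected) - loss(out1, wm, w2, w3, expected)) / (2.0 * eps);
            worst = std::max(worst, std::fabs(numeric - theta2[j] * out1[i]));
        }
    }
    return worst;
}
```

If the chain theta4 -> theta3 -> theta2 is wired correctly, max_gradient_error_w1() should come out very close to zero. A large value would suggest that one of the error terms is missing or uses the wrong weights.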

answered Oct 31 '18 at 15:55

Tezirg

I can't believe I made such a naive mistake. Let me train again. Hopefully, it runs :) – Nerd Giraffe Oct 31 '18 at 16:24
Hope it will give you a nice SGD :). Please remember to upvote and/or accept the answer if it worked. – Tezirg Oct 31 '18 at 16:26
Do you think changing sigmoid to softmax in the last layer will also help with accuracy? What other measures should I take? – Nerd Giraffe Oct 31 '18 at 17:13
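
On the softmax question in the last comment, here is a minimal sketch of what the output layer might look like, assuming softmax is paired with a cross-entropy loss (which this thread does not confirm). Under that assumption the output error term simplifies to expected[i] - out[i], without the out * (1 - out) factor used for sigmoid. The function names softmax and softmax_output_theta are hypothetical, and whether this change improves accuracy for this particular network is not established here.

```cpp
#include <algorithm>
#include <cassert>
#include <cmath>
#include <cstddef>
#include <vector>

// Numerically stable softmax: subtracting the largest input before
// exponentiating avoids overflow and does not change the result.
std::vector<double> softmax(const std::vector<double>& in) {
    std::vector<double> out(in.size());
    if (in.empty()) return out;
    const double max_in = *std::max_element(in.begin(), in.end());
    double sum = 0.0;
    for (std::size_t i = 0; i < in.size(); ++i) {
        out[i] = std::exp(in[i] - max_in);
        sum += out[i];
    }
    for (double& v : out) v /= sum;
    return out;
}

// Output error term, assuming softmax outputs and cross-entropy loss.
// Same sign convention as theta4 in the question (expected minus output).
std::vector<double> softmax_output_theta(const std::vector<double>& out,
                                         const std::vector<double>& expected) {
    assert(out.size() == expected.size());
    std::vector<double> theta(out.size());
    for (std::size_t i = 0; i < out.size(); ++i)
        theta[i] = expected[i] - out[i];
    return theta;
}
```

In the question's code this would correspond to replacing the final sigmoid loop over in4 and the theta4 formula, while the hidden layers keep their sigmoid error terms.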
