
I have been hard at work learning how to write a custom neural network without a library of any sort.

It is a fairly complicated concept, but I have read enough and watched enough videos to think I get it.

But my network is not reliable.

And weirdly, sometimes it goes backwards, as in my prediction confidence goes from high to low.

Here is what I have understood so far.

Going forward, it is:

weights * inputs + bias

I wrote a dot product that, as far as I can tell, is working fine.
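To make that concrete, this is a minimal sketch of what I mean by that step for a single node (the function and variable names here are just for illustration, not my actual classes):

```
// Weighted sum for one node: weights * inputs + bias
function nodeForward(inputs, weights, bias) {
  let sum = bias;
  for (let i = 0; i < inputs.length; i++) {
    sum += inputs[i] * weights[i];
  }
  return sum; // this value is then passed to the activation
}
```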

Then it's the activation, to see whether that particular node is firing or not.

In my hidden layers I'm using ReLU, and in my output layer I'm using softmax; those seem to be working fine also.
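For reference, these are the activations as I understand them (again a simplified sketch, not my exact code):

```
// ReLU for the hidden layers: pass positive values through, zero out negatives
function relu(x) {
  return Math.max(0, x);
}

// Softmax for the output layer: turn raw outputs into probabilities that sum to 1
function softmax(values) {
  const max = Math.max(...values);             // subtract the max for numerical stability
  const exps = values.map(v => Math.exp(v - max));
  const total = exps.reduce((a, b) => a + b, 0);
  return exps.map(e => e / total);
}
```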

And then I calculate the error, for which I'm using cross entropy with one-hot encoded targets.

As far as I can see, my forward pass works fine.
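With a one-hot target that ends up being just the negative log of the predicted probability at the correct index (illustrative version):

```
// Cross entropy with a one-hot target: only the correct class contributes to the loss
function crossEntropy(predictions, oneHotTarget) {
  let loss = 0;
  for (let i = 0; i < predictions.length; i++) {
    if (oneHotTarget[i] === 1) {
      loss += -Math.log(predictions[i] + 1e-12); // small epsilon to avoid log(0)
    }
  }
  return loss;
}
```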

But going backward is, I think, where my understanding fails.

I spent a lot of time trying to understand backward propagation, and I'm almost there.

First off:

I understand that it is a measure of how much change is necessary to make the prediction better, which is then applied as gradient descent.

Because there are so many variables, it becomes a chain rule over multiple variables.

If you summarize that concept, it becomes:

∂Error / ∂weight (the error with respect to the weight)

or

  • on the output layer, because the error is directly connected to the output: (∂Error[index] / ∂Output[index]) * (∂Output[index] / ∂Input[index]) * (∂Input[index] / ∂weight)

  • on the hidden layer, because the error is a compilation of the total error of the previous layer: (∂ErrorTotal[index] / ∂Output[index]) * (∂Output[index] / ∂Input[index]) * (∂Input[index] / ∂weight)

where ErrorTotal is all of the errors from the previous layer (previous in the backward direction, i.e. the layer just processed), each multiplied by the weight that connects that previous-layer node to the current node at the current index (see the sketch after this list).
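Written out as code, my understanding of those two cases is something like this (illustrative helper functions, not my real classes; I'm relying on the standard result that softmax + cross entropy together make the output-layer term collapse to output minus target):

```
// Output layer delta: with softmax + cross entropy this simplifies to (output - target)
function outputDeltas(outputs, oneHotTarget) {
  return outputs.map((o, i) => o - oneHotTarget[i]);
}

// Hidden layer delta: sum the deltas of the layer just processed, weighted by the
// connections out of this node, then multiply by the ReLU derivative of this node's input
function hiddenDelta(nextDeltas, weightsFromThisNode, preActivation) {
  let errorTotal = 0;
  for (let i = 0; i < nextDeltas.length; i++) {
    errorTotal += nextDeltas[i] * weightsFromThisNode[i];
  }
  const reluDerivative = preActivation > 0 ? 1 : 0;
  return errorTotal * reluDerivative;
}
```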

Pragmatically, you would:

1. Calculate the derivative of the activation with respect to the value coming from the dot product.
2. Calculate the derivative of the dot product with respect to the weights, which is the output of the node that was originally sent over that weight.
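And per weight, I believe the gradient is just that node's delta times the input that travelled over the weight, with the update subtracting it scaled by the learning rate (again just a sketch, not my actual code):

```
// Gradient for one weight = delta of the node it feeds into * the input that travelled over it.
// Update: step against the gradient, scaled by the learning rate.
function updateWeights(weights, inputs, delta, learningRate) {
  return weights.map((w, i) => w - learningRate * delta * inputs[i]);
}
```

Here is my actual implementation: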

```
  /**
   * In back propagation, this function pulls the previous layer's derived error
   * values into the current layer, calculates the current layer's total error
   * per node, and updates the previous layer's weights per node.
   * @param current_layer
   * @param previous_layer_in_back_propagation
   * @constructor
   */
  Pass_Backward(current_layer , previous_layer_in_back_propagation ) {
    /*
       You have to go backward - the current index increments after a full iteration of the previous weight index.
       All the weights have to point to the same current node.

       So: current forEach
           nested previous forEach
     */

    let inside_last_layer = (previous_layer_in_back_propagation.Get_Meta_Tag() == Enumerations.Meta_Tags.Output_Layer);
    let activated_error_derivatives = null;

    if(!inside_last_layer) {
      activated_error_derivatives = previous_layer_in_back_propagation.Get_Entire_Node_List_Total_Error_Values_As_Array();
    }
    else {
      activated_error_derivatives = previous_layer_in_back_propagation.Get_Entire_Node_List_Derived_Activation_Values_As_Array();
    }



    current_layer.Get_Entire_Node_List().forEach((current_node, current_node_index) => {
        // training the last layer

        let dot_product = null;

        previous_layer_in_back_propagation.Get_Entire_Node_List().forEach((previous_node, previous_node_index) => {
            let weights_derivatives = [];

            previous_node.Get_Weights().forEach((previous_nodes_weight, previous_nodes_weight_index) => {
                weights_derivatives.push(previous_node.Get_Derived_Activation_Value() * current_node.Get_Value());
            });

            previous_node.Set_Weights_Derivative_Values(weights_derivatives);

            dot_product = Maths.Dot_Product(activated_error_derivatives, weights_derivatives);

        });



        current_node.Set_Nodes_Total_Error(current_node.Get_Value() * dot_product );

    });

    this.Update_Weights_In_Backpropagation(  previous_layer_in_back_propagation );

  }
```

But like I said, I'm missing something.

My network is not reliable:

  • sometimes it learns - confidence in the prediction increases
  • sometimes it is all over the place
  • sometimes it unlearns - confidence in the prediction decreases

Please help.

I have included my backward propagation above, which happens after I derive my activation.

I have read so much, watched so many tutorials, and written so many variations, and I haven't gotten it yet.

UPDATE: The predictions become unstable when I introduce negative weight initializations into the system.
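By negative weight initializations I mean something like the symmetric random init below (the He-style scaling here is just one common choice for ReLU layers, not necessarily what I should be using):

```
// Symmetric random initialization: values roughly in [-limit, +limit]
// (He-style scaling, a common choice for ReLU layers)
function initWeights(numInputs) {
  const limit = Math.sqrt(2 / numInputs);
  const weights = [];
  for (let i = 0; i < numInputs; i++) {
    weights.push((Math.random() * 2 - 1) * limit);
  }
  return weights;
}
```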