I'm currently trying to manually compute the output probabilities of my neural network, using the weight matrices and bias vectors provided by Python's MLPClassifier. The goal is to reproduce the output of mlp.predict_proba. Unfortunately, for a reason I can't identify, I'm not able to do it. First I take the inner product of the test data and the first weight matrix, add the bias vector of the same layer, and apply the activation function ('relu' in this case), and so on up to the output layer. Below is the code I'm using, with some additional notes.

# compute predictions using the matrices of weights
import numpy as np
# matrices of weights and bias vectors
theta1 = mlp.coefs_[0]      # 13 x 14 matrix
bias1 = mlp.intercepts_[0]  # 14 x 1 vector
theta2 = mlp.coefs_[1]      # 14 x 13 matrix
bias2 = mlp.intercepts_[1]  # 13 x 1 vector
theta3 = mlp.coefs_[2]      # 13 x 12 matrix
bias3 = mlp.intercepts_[2]  # 12 x 1 vector
theta4 = mlp.coefs_[3]      # 12 x 3 matrix
bias4 = mlp.intercepts_[3]  # 3 x 1 vector

def relu(X):
    return np.maximum(0,X)

def probCalc(X_test):  # X_test: 45 x 13 matrix with attributes along the columns, values between 0 and 1

    # number of layers calculation
    nLayer = len(mlp.hidden_layer_sizes)
    j = True

    # weight matrices and layers calculation
    for i in range(nLayer+1):

        if j:
            theta = mlp.coefs_[i]
            bias = mlp.intercepts_[i]
            a = relu(np.dot(X_test, theta) + bias)
            j = False

        else:
            theta = mlp.coefs_[i]
            bias = mlp.intercepts_[i]
            a = relu(np.dot(a, theta) + bias)

    return a

myProbCalc = probCalc(X_test)

Thank you in advance :) Joao

1 Answer

It is highly likely that your final layer doesn't actually have a relu activation. There is no reason you would get numbers between 0 and 1 (whose sum is 1, on top of that) from a relu activation.

Replace the last layer activation with a softmax function instead.
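
For illustration, here is a minimal sketch of what that could look like: relu is kept on the hidden layers and softmax is applied only to the output layer. This assumes a multiclass model (as your 3-unit output layer suggests); for a binary classifier scikit-learn would use a logistic output instead. The mlp and X_test names come from your code, and probCalcSoftmax is just an illustrative name:

import numpy as np

def relu(X):
    return np.maximum(0, X)

def softmax(X):
    # subtract the row-wise max for numerical stability
    e = np.exp(X - np.max(X, axis=1, keepdims=True))
    return e / np.sum(e, axis=1, keepdims=True)

def probCalcSoftmax(X_test):
    a = X_test
    nWeights = len(mlp.coefs_)  # number of weight matrices = hidden layers + 1
    for i in range(nWeights):
        z = np.dot(a, mlp.coefs_[i]) + mlp.intercepts_[i]
        if i < nWeights - 1:
            a = relu(z)      # hidden layers keep the relu activation
        else:
            a = softmax(z)   # output layer: softmax instead of relu
    return a

# should now match mlp.predict_proba(X_test) up to floating-point error
print(np.allclose(probCalcSoftmax(X_test), mlp.predict_proba(X_test)))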

Tristan Nemoz
  • I've tried your approach, computing the last layer with the softmax function; nevertheless, it didn't work. On the other hand, shouldn't we maintain the same activation function across the layers, or am I thinking wrongly? – JPatricio May 14 '20 at 15:02
  • It is very common not to have the same activation function on the last layer. What Python library do you use? – Tristan Nemoz May 14 '20 at 20:09
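
If the model is scikit-learn's MLPClassifier (as the question suggests), the fitted estimator records which activations were actually used, so this can be checked directly; a quick sketch:

print(mlp.activation)       # activation used for the hidden layers, e.g. 'relu'
print(mlp.out_activation_)  # output-layer activation: 'softmax' for multiclass, 'logistic' for binary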