This is the Boolean function I am trying to fit: [boolean function description][1]
Theoretically, a neural network with one hidden layer containing at least 3 neurons should be enough.
And that's how I built the neural network in PyTorch.
However, although the network's predictions are usually correct, the parameters (I mean the weights and biases) are not as I expected.
I expected the parameters to look like this (a perceptron operation is equivalent to a Boolean gate):
[perceptron equivalent to a boolean gate][2]
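For example (a generic illustration, not my actual target function): a single ReLU neuron with weights [1, 1] and bias -1 computes AND(x, y) on binary inputs, since relu(x + y - 1) is 1 only when both inputs are 1:

    import torch
    import torch.nn.functional as F

    # A ReLU "perceptron" acting as an AND gate on binary inputs.
    w = torch.tensor([1.0, 1.0])
    b = torch.tensor(-1.0)
    for x in (0.0, 1.0):
        for y in (0.0, 1.0):
            out = F.relu(torch.dot(w, torch.tensor([x, y])) + b)
            print(int(x), int(y), out.item())  # prints the AND truth table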
Here is the key code:
    import torch
    from torch import nn

    device = "cuda" if torch.cuda.is_available() else "cpu"

    class NeuralNetwork(nn.Module):
        def __init__(self):
            super(NeuralNetwork, self).__init__()
            # 4 inputs -> 3 hidden ReLU neurons -> 1 output
            self.linear_relu_stack = nn.Sequential(
                nn.Linear(4, 3),
                nn.ReLU(),
                nn.Linear(3, 1),
            )

        def forward(self, x):
            logits = self.linear_relu_stack(x)
            return logits

    model = NeuralNetwork().to(device)

    base_lr = 0.001
    optimizer = torch.optim.Adam(model.parameters(), base_lr)
    criterion = nn.MSELoss().to(device)
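For completeness, the training loop was roughly like this (a simplified sketch; `X` enumerates the full 16-row truth table, and `y` is a placeholder you would replace with the Boolean function's target outputs):

    # X: all 16 combinations of (w, x, y, z) as a float tensor.
    X = torch.tensor([[(i >> 3) & 1, (i >> 2) & 1, (i >> 1) & 1, i & 1]
                      for i in range(16)], dtype=torch.float32).to(device)
    # Placeholder targets: replace with the actual truth-table outputs.
    y = torch.zeros(16, 1).to(device)

    for epoch in range(10000):  # epoch count approximate
        optimizer.zero_grad()
        loss = criterion(model(X), y)
        loss.backward()
        optimizer.step()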
Here is the key output:
First, the predictions are not bad: the categories are correct, but some of the numbers are not precise enough.
w x y z pred
0 0 0 0 0 [tensor(0.9992, grad_fn=<UnbindBackward0>)]
1 0 0 0 1 [tensor(0.2459, grad_fn=<UnbindBackward0>)]
2 0 0 1 0 [tensor(0.9992, grad_fn=<UnbindBackward0>)]
3 0 0 1 1 [tensor(0.0040, grad_fn=<UnbindBackward0>)]
4 0 1 0 0 [tensor(0.9992, grad_fn=<UnbindBackward0>)]
5 0 1 0 1 [tensor(0.7707, grad_fn=<UnbindBackward0>)]
6 0 1 1 0 [tensor(-0.0015, grad_fn=<UnbindBackward0>)]
7 0 1 1 1 [tensor(-0.0025, grad_fn=<UnbindBackward0>)]
8 1 0 0 0 [tensor(0.9992, grad_fn=<UnbindBackward0>)]
9 1 0 0 1 [tensor(-0.2525, grad_fn=<UnbindBackward0>)]
10 1 0 1 0 [tensor(0.9992, grad_fn=<UnbindBackward0>)]
11 1 0 1 1 [tensor(-0.0077, grad_fn=<UnbindBackward0>)]
12 1 1 0 0 [tensor(0.9992, grad_fn=<UnbindBackward0>)]
13 1 1 0 1 [tensor(0.2722, grad_fn=<UnbindBackward0>)]
14 1 1 1 0 [tensor(-0.0066, grad_fn=<UnbindBackward0>)]
15 1 1 1 1 [tensor(0.0033, grad_fn=<UnbindBackward0>)]
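The table was printed with something like this (simplified):

    # Roughly how the prediction table above was produced.
    preds = model(X)
    for i in range(16):
        w_, x_, y_, z_ = X[i].int().tolist()
        print(i, w_, x_, y_, z_, [p for p in preds[i]])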
Second, the parameters are not as expected.
linear_relu_stack.0.weight tensor([[-0.3637, 0.3838, 0.7624, 0.3661],
[ 0.2857, 0.5719, 0.5721, -0.5846],
[ 0.4782, -0.5035, -0.2349, 1.2070]])
linear_relu_stack.0.bias tensor([-0.7657, -0.8599, -0.4842])
linear_relu_stack.2.weight tensor([[-1.3418, -1.7255, -1.0422]])
linear_relu_stack.2.bias tensor([0.9992])
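These were printed with something like:

    # Inspect the learned weights and biases by name.
    for name, param in model.named_parameters():
        print(name, param.data)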
My question is: why doesn't the NN converge to my expected parameters?
What's the problem?
[1]: https://i.stack.imgur.com/WqaXi.png
[2]: https://i.stack.imgur.com/Z9cQb.png