This is the Boolean function I am trying to fit: [boolean function description][1]
Theoretically, a neural network with one hidden layer containing at least 3 neurons should be enough.
And that's how I built the neural network in PyTorch.
However, although the network's predictions are usually correct, the parameters (I mean the weights and biases) are not as I expected.
I expected the parameters to look like this (a perceptron operation is equivalent to a Boolean gate):
[perceptron equivalent to a boolean gate][2]
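For example (a generic illustration, not my actual target function): a single ReLU neuron with weights [1, 1] and bias -1 computes AND(x, y) on binary inputs, since relu(x + y - 1) is 1 only when both inputs are 1:

    import torch
    import torch.nn.functional as F

    # A ReLU "perceptron" acting as an AND gate on binary inputs.
    w = torch.tensor([1.0, 1.0])
    b = torch.tensor(-1.0)
    for x in (0.0, 1.0):
        for y in (0.0, 1.0):
            out = F.relu(torch.dot(w, torch.tensor([x, y])) + b)
            print(int(x), int(y), out.item())  # prints the AND truth table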
Here is the key code:
    import torch
    from torch import nn

    device = "cuda" if torch.cuda.is_available() else "cpu"

    class NeuralNetwork(nn.Module):
        def __init__(self):
            super(NeuralNetwork, self).__init__()
            # 4 inputs -> 3 hidden ReLU neurons -> 1 output
            self.linear_relu_stack = nn.Sequential(
                nn.Linear(4, 3),
                nn.ReLU(),
                nn.Linear(3, 1),
            )

        def forward(self, x):
            logits = self.linear_relu_stack(x)
            return logits

    model = NeuralNetwork().to(device)

    base_lr = 0.001
    optimizer = torch.optim.Adam(model.parameters(), base_lr)
    criterion = nn.MSELoss().to(device)
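For completeness, the training loop was roughly like this (a simplified sketch; `X` enumerates the full 16-row truth table, and `y` is a placeholder you would replace with the Boolean function's target outputs):

    # X: all 16 combinations of (w, x, y, z) as a float tensor.
    X = torch.tensor([[(i >> 3) & 1, (i >> 2) & 1, (i >> 1) & 1, i & 1]
                      for i in range(16)], dtype=torch.float32).to(device)
    # Placeholder targets: replace with the actual truth-table outputs.
    y = torch.zeros(16, 1).to(device)

    for epoch in range(10000):  # epoch count approximate
        optimizer.zero_grad()
        loss = criterion(model(X), y)
        loss.backward()
        optimizer.step()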
Here is the key output:
First, the predictions are not bad: the categories are correct, but some of the numbers are not precise enough.
w x y z pred
0 0 0 0 0 [tensor(0.9992, grad_fn=<UnbindBackward0>)]
1 0 0 0 1 [tensor(0.2459, grad_fn=<UnbindBackward0>)]
2 0 0 1 0 [tensor(0.9992, grad_fn=<UnbindBackward0>)]
3 0 0 1 1 [tensor(0.0040, grad_fn=<UnbindBackward0>)]
4 0 1 0 0 [tensor(0.9992, grad_fn=<UnbindBackward0>)]
5 0 1 0 1 [tensor(0.7707, grad_fn=<UnbindBackward0>)]
6 0 1 1 0 [tensor(-0.0015, grad_fn=<UnbindBackward0>)]
7 0 1 1 1 [tensor(-0.0025, grad_fn=<UnbindBackward0>)]
8 1 0 0 0 [tensor(0.9992, grad_fn=<UnbindBackward0>)]
9 1 0 0 1 [tensor(-0.2525, grad_fn=<UnbindBackward0>)]
10 1 0 1 0 [tensor(0.9992, grad_fn=<UnbindBackward0>)]
11 1 0 1 1 [tensor(-0.0077, grad_fn=<UnbindBackward0>)]
12 1 1 0 0 [tensor(0.9992, grad_fn=<UnbindBackward0>)]
13 1 1 0 1 [tensor(0.2722, grad_fn=<UnbindBackward0>)]
14 1 1 1 0 [tensor(-0.0066, grad_fn=<UnbindBackward0>)]
15 1 1 1 1 [tensor(0.0033, grad_fn=<UnbindBackward0>)]
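The table was printed with something like this (simplified):

    # Roughly how the prediction table above was produced.
    preds = model(X)
    for i in range(16):
        w_, x_, y_, z_ = X[i].int().tolist()
        print(i, w_, x_, y_, z_, [p for p in preds[i]])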
Second, the parameters are not as expected.
linear_relu_stack.0.weight tensor([[-0.3637, 0.3838, 0.7624, 0.3661],
[ 0.2857, 0.5719, 0.5721, -0.5846],
[ 0.4782, -0.5035, -0.2349, 1.2070]])
linear_relu_stack.0.bias tensor([-0.7657, -0.8599, -0.4842])
linear_relu_stack.2.weight tensor([[-1.3418, -1.7255, -1.0422]])
linear_relu_stack.2.bias tensor([0.9992])
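These were printed with something like:

    # Inspect the learned weights and biases by name.
    for name, param in model.named_parameters():
        print(name, param.data)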
My question is: why doesn't the NN converge to my expected parameters?
What's the problem?
[1]: https://i.stack.imgur.com/WqaXi.png
[2]: https://i.stack.imgur.com/Z9cQb.png