
Let's say my inputs are A and B. The inputs look roughly like this: A = [10, 5, 30, 2], which can have arbitrary values in the range [1, 100], and B = [0, 1, 0, 0], which is a one-hot vector. The expected output, C = [5], is the dot product of the two input vectors, C = A.B.

Similarly, for A = [10, 5, 30, 2] and B = [0, 0, 1, 0], the output will be C = [30].

Essentially, I want the neural network to act as a 4-way multiplexer (https://en.wikipedia.org/wiki/Multiplexer).
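
For reference, here is the target mapping written out in plain NumPy (a minimal sketch; the function name is just illustrative):

import numpy as np

def multiplexer(A, B):
    # B is one-hot, so the dot product simply selects one element of A.
    return np.dot(A, B)

print(multiplexer(np.array([10, 5, 30, 2]), np.array([0, 1, 0, 0])))  # 5
print(multiplexer(np.array([10, 5, 30, 2]), np.array([0, 0, 1, 0])))  # 30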

I have implemented a neural network with two hidden layers. While it works on training data, it fails to generalize beyond that.

Is there an underlying reason why this problem would be difficult for a neural network?

2 Answers


According to the universal approximation theorem, a fully connected neural network with a single hidden layer can be a "practical" universal approximator (given a number of conditions and considerations).

More of that here: https://en.wikipedia.org/wiki/Universal_approximation_theorem

So, yes, the network can indeed approximate a multiplexer. You have to take a few factors into account, though. You can try standardizing or normalizing your input data (input features on different scales can disrupt the network's learning process); you can find some information here:

https://stats.stackexchange.com/questions/10289/whats-the-difference-between-normalization-and-standardization

Also, take a look at your input space: you have 100^4 times 4 possible inputs (around 4 x 10^8). Based on that, you have to consider the size of your training data, because a few thousand examples won't make the cut; the data is very sparse, so the examples in the training sample can be very different from the ones in the validation data. A sketch of the input scaling idea follows below.
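
As a concrete illustration of the scaling suggestion, a minimal sketch of the data preparation (this assumes you simply rescale A from [1, 100] down to (0, 1]; the variable names and sample size are illustrative, not the original poster's setup):

import numpy as np

rng = np.random.default_rng(0)

# Synthetic training pairs: A has values in [1, 100], B is a one-hot selector.
n = 10000
A = rng.integers(1, 101, size=(n, 4)).astype(np.float32)
B = np.eye(4, dtype=np.float32)[rng.integers(0, 4, size=n)]

A_scaled = A / 100.0                             # rescale A to (0, 1]
X = np.concatenate([A_scaled, B], axis=1)        # 8-dimensional network input
y = np.sum(A_scaled * B, axis=1, keepdims=True)  # target, on the same scale as the input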

Juan David

Basically, you're looking for the dot product of A and B (np.dot(A, B) in NumPy terms). Juan's answer is right and wrong at the same time. Standard fully connected layers don't compute multiplicative interactions between inputs (just check the math: each layer is a weighted sum followed by a nonlinearity), so there is no "natural" representation of this formula. It can be approximated, though, if you have a sufficiently complex architecture.
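
One way to see this concretely: if you give the architecture an explicit multiplicative interaction, the formula becomes representable exactly. This is a sketch using Keras' Multiply layer (an assumption about how one might wire it, not the original poster's model):

import numpy as np
import tensorflow as tf

# Two 4-dimensional inputs: the values A and the one-hot selector B.
a_in = tf.keras.Input(shape=(4,))
b_in = tf.keras.Input(shape=(4,))

# Elementwise product followed by a fixed sum reproduces the dot product A.B exactly;
# a stack of plain Dense layers can only approximate this interaction.
prod = tf.keras.layers.Multiply()([a_in, b_in])
out = tf.keras.layers.Dense(1, use_bias=False,
                            kernel_initializer="ones", trainable=False)(prod)

model = tf.keras.Model([a_in, b_in], out)

print(model.predict([np.array([[10., 5., 30., 2.]]),
                     np.array([[0., 1., 0., 0.]])]))  # [[5.]]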

Without a neural network, in bare TensorFlow it is just tf.math.reduce_sum(A * B).

Example:

>>> import tensorflow as tf  # TensorFlow 1.x (graph mode)
>>> A = tf.constant([10, 5, 30, 2])
>>> B = tf.constant([0, 1, 0, 0])
>>> with tf.Session() as sess: print(tf.math.reduce_sum(A * B).eval())
5
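
In TensorFlow 2.x (eager execution) the same computation works without a session; a quick check:

>>> import tensorflow as tf
>>> A = tf.constant([10, 5, 30, 2])
>>> B = tf.constant([0, 1, 0, 0])
>>> print(tf.math.reduce_sum(A * B).numpy())
5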
Marat