7

I have a neural network with N input nodes and N output nodes, and possibly multiple hidden layers and recurrences in it, but let's forget about those for now. The goal of the neural network is to learn an N-dimensional variable Y*, given an N-dimensional value X. Let's say the output of the neural network is Y, which should be close to Y* after learning. My question is: is it possible to get the inverse of the neural network for the output Y*? That is, how do I get the value X* that would yield Y* when put into the neural network? (or something close to it)

A major part of the problem is that N is very large, typically on the order of 10,000 or 100,000, but if anyone knows how to solve this for a small network with no recurrences or hidden layers, that might already be helpful. Thank you.

user2118903

5 Answers

3

If you can choose the neural network such that the number of nodes in each layer is the same, the weight matrices are non-singular, and the transfer function is invertible (e.g. leaky ReLU), then the whole function will be invertible.

This kind of neural network is simply a composition of matrix multiplication, addition of a bias and a transfer function. To invert it, apply the inverse of each operation in the reverse order: take the output, apply the inverse transfer function, subtract the bias, multiply by the inverse of the last weight matrix, then apply the inverse transfer function again, subtract the bias, multiply by the inverse of the second-to-last weight matrix, and so on.
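A minimal NumPy sketch of this layer-by-layer inversion, assuming square, non-singular weight matrices and a leaky-ReLU transfer function (the names and dimensions here are illustrative):

```python
import numpy as np

def leaky_relu(x, alpha=0.1):
    return np.where(x > 0, x, alpha * x)

def leaky_relu_inv(y, alpha=0.1):
    # Inverse of the leaky ReLU: divide the negative part by alpha.
    return np.where(y > 0, y, y / alpha)

def forward(x, weights, biases):
    # y = f(W_L ... f(W_1 x + b_1) ... + b_L)
    for W, b in zip(weights, biases):
        x = leaky_relu(W @ x + b)
    return x

def invert(y, weights, biases):
    # Undo each layer in reverse order: inverse activation,
    # subtract the bias, multiply by the inverse weight matrix.
    for W, b in zip(reversed(weights), reversed(biases)):
        y = np.linalg.solve(W, leaky_relu_inv(y) - b)
    return y

rng = np.random.default_rng(0)
n = 4
weights = [rng.standard_normal((n, n)) for _ in range(3)]  # almost surely non-singular
biases = [rng.standard_normal(n) for _ in range(3)]

x = rng.standard_normal(n)
y = forward(x, weights, biases)
x_rec = invert(y, weights, biases)
print(np.allclose(x, x_rec))   # True, up to floating-point error
```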

zenna
2

This is a task that might be solvable with autoencoders. You might also be interested in generative models like Restricted Boltzmann Machines (RBMs), which can be stacked to form Deep Belief Networks (DBNs). RBMs build an internal model h of the data v that can be used to reconstruct v. In a DBN, the h of the first layer becomes the v of the second layer, and so on.
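A rough sketch of that reconstruction idea (not a trained model): an RBM maps v to a hidden representation h and back. W, b and c below are random placeholders that would normally be learned, e.g. with contrastive divergence:

```python
import numpy as np

def sigmoid(x):
    return 1.0 / (1.0 + np.exp(-x))

rng = np.random.default_rng(0)
n_visible, n_hidden = 6, 3
W = rng.standard_normal((n_hidden, n_visible)) * 0.1  # untrained placeholder weights
b = np.zeros(n_visible)   # visible bias
c = np.zeros(n_hidden)    # hidden bias

v = rng.integers(0, 2, size=n_visible).astype(float)
h = sigmoid(W @ v + c)                    # internal model h of the data v
v_reconstructed = sigmoid(W.T @ h + b)    # reconstruction of v from h
# In a DBN, the h of this layer would serve as the v of the next RBM.
```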

alfa
  • An auto-encoder reproduces its input, whereas the question here is to find the inverse function of a network. Perhaps you mean adding a mirror image of the original network and training the composite network end-to-end as an auto-encoder? – Yan King Yin Jun 12 '18 at 18:02
1

zenna is right. If you are using bijective (invertible) activation functions you can invert layer by layer: apply the inverse activation, subtract the bias and multiply by the pseudoinverse of the weight matrix (if every layer has the same number of neurons, this is also the exact inverse, under some mild regularity conditions). To repeat the conditions: dim(X) == dim(Y) == dim(layer_i) and det(W_i) != 0.

An example:

Y = tanh( W2*tanh( W1*X + b1 ) + b2 )
X = W1p*( tanh^-1( W2p*( tanh^-1(Y) - b2 ) ) - b1 )

where W2p and W1p are the pseudoinverse matrices of W2 and W1, respectively.
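A small NumPy sketch of this two-layer example, using np.linalg.pinv for the pseudoinverses and np.arctanh for tanh^-1 (the dimensions and values are made up for illustration):

```python
import numpy as np

rng = np.random.default_rng(1)
n = 3
W1, W2 = rng.standard_normal((n, n)), rng.standard_normal((n, n))
b1, b2 = rng.standard_normal(n), rng.standard_normal(n)

# Forward pass: Y = tanh(W2 * tanh(W1*X + b1) + b2)
X = rng.standard_normal(n)
Y = np.tanh(W2 @ np.tanh(W1 @ X + b1) + b2)

# Inverse pass with pseudoinverses W1p, W2p
W1p, W2p = np.linalg.pinv(W1), np.linalg.pinv(W2)
X_rec = W1p @ (np.arctanh(W2p @ (np.arctanh(Y) - b2)) - b1)
print(np.allclose(X, X_rec))   # True, provided the arctanh arguments stay inside (-1, 1)
```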

userčina
    Do you have any details about the numerical stability of such an approach? It seems to me like it could go horribly wrong. –  Jan 31 '17 at 04:11
  • Yes it could, e.g. if |W2p*(tanh^-1(Y) - b2)| > 1 the argument would fall outside the domain of tanh^-1. – userčina Jan 31 '17 at 12:30
  • but that would not happen because the output of sigmoid or tanh is bounded within [0,1] or [-1,1] – Yan King Yin Jun 12 '18 at 17:49
  • In fact, if the weight matrices are non-singular, the inverse is unique. In practice, however, small disturbances (in the "output") lead to huge fluctuations (in the "input"). This is characteristic of chaos. – Yan King Yin Jun 13 '18 at 06:58
1

The following paper is an industrial case study in inverting a function learned by a neural network, and it looks like a good starting point for understanding how to set up the problem.

Abhinav
  • It's an undergrad paper and I don't know how the undergrad student got interested in a steel industry problem unless, very likely, the professor has "primed" him on it... – Yan King Yin Jun 13 '18 at 16:59
  • (not that it's wrong... it's perfectly fine as applied math) – Yan King Yin Jun 14 '18 at 05:30
  • here is a paper with more in-depth analysis: http://citeseerx.ist.psu.edu/viewdoc/download?doi=10.1.1.17.5892&rep=rep1&type=pdf – Yan King Yin Jun 14 '18 at 05:31
1

An alternative way of approaching the task of getting the desired x that yields the desired y is to start with a random x (or a seed input) and then repeatedly adjust x by gradient descent until it yields a y close to the desired y. The algorithm is similar to back-propagation, except that you take derivatives with respect to x instead of the weights and biases, and mini-batching is not needed. This approach has the advantage that it allows a seed input (a starting x, if not randomly selected). I also have a hypothesis that the final x will retain some similarity to the initial x (the seed), which would imply that this algorithm can carry characteristics of the seed over into the result, depending on the context of the neural network application.
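A hedged sketch of this idea using PyTorch autograd; `net` stands in for any differentiable trained network, and the function name, loss and hyperparameters below are illustrative choices, not a fixed recipe:

```python
import torch

def invert_by_gradient_descent(net, y_target, x_seed, steps=1000, lr=0.01):
    # Optimize the input x (not the weights) so that net(x) approaches y_target.
    x = x_seed.clone().detach().requires_grad_(True)
    optimizer = torch.optim.Adam([x], lr=lr)
    for _ in range(steps):
        optimizer.zero_grad()
        loss = torch.nn.functional.mse_loss(net(x), y_target)
        loss.backward()        # gradients w.r.t. x, not the parameters
        optimizer.step()
    return x.detach()

# Example with a small untrained network, just to show the mechanics:
net = torch.nn.Sequential(torch.nn.Linear(8, 8), torch.nn.Tanh(),
                          torch.nn.Linear(8, 8))
y_target = torch.randn(8)
x_seed = torch.randn(8)        # the seed; a purely random start also works
x_star = invert_by_gradient_descent(net, y_target, x_seed)
print(torch.nn.functional.mse_loss(net(x_star), y_target).item())  # residual error
```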

Toll