
I'm curious about the kinds of limitations even an expertly designed network might have. This one in particular is what I could use some insight on:

given:

a set of random integers of non-trivial size (say at least 500)

an expertly created/trained neural network.

task:

number anagram: in a given time frame, produce the longest possible run of an infinite integer sequence, where the sequence either can be represented in closed form (e.g. n^2, 2x+5) or is registered in OEIS (http://oeis.org/). The numbers used to create the sequence can be taken from the input set in any order, so if the network is fed (3, 5, 1, 7, ...), returning (1, 3, 5, 7, ...) would be an acceptable result.

It's my understanding that an ANN can be trained to look for a particular sequence pattern (again, n^2, 2x+5, etc.). What I'm wondering is whether it can be made to recognize a more general pattern like n^y or xy+z. My thinking is that it won't be able to, because n^y can produce sequences that look different enough from one another that no stable 'base pattern' can be established. That is, intrinsic to the way ANNs work (taking sets of input and fuzzy-matching them against a static pattern the network has been trained to look for) is a limit on the scope of what they can be trained to find.

Have I got this right?

snerd
  • I'm not 100% clear on what you're trying to do with them, but a good intuition is to think of NNs as fuzzy digital circuits. If you can make a digital circuit that can do your task, you can make an NN that can also do it. NNs can also take analog values as inputs, but they are limited to doing simple multiplications by constants and sums. Nonlinearities can be used to approximate more complicated functions in a piecewise manner. – Houshalter Apr 03 '15 at 07:34
  • @Houshalter thx for getting back to me! what I'm trying to understand is the limits of a NNs ability to recognize patterns qua patterns. I know that a NN can be trained to recognize a particular pattern, like x^2 for example. So for every run, if there exists a sequence that is x^2 - the NN will probably report it. If each run has different sequences, say x^2 for the first, 2x * 5 for the second, xlogx for the third, etc, then a more complicated NN is required but it could be trained to find these things. what I'm looking to confirm is my understanding that: – snerd Apr 03 '15 at 16:04
  • 1) the NN will not be able to recognize a pattern it hasn't been explicitly told to look for. 2) the NN can't recognize a more general form of a pattern it was trained to find. So while it can be trained to look for x^2, it can't ever be trained to look for x^y. – snerd Apr 03 '15 at 16:05
  • 1
    Think of NNs as "function approximators". They can be trained to produce specific outputs given specific inputs. How good they are at approximating a function depends on a lot of things, but they seem to do most simple continuous functions fairly well with enough hidden units and training examples. I would not expect it to recognize a pattern outside of what it has been trained on. They interpolate not extrapolate. – Houshalter Apr 10 '15 at 04:43
  • @Houshalter do you happen to have any citations containing hard data? Also - when you say 'simple continuous functions' - can you give me an idea of the difference between 'simple' and 'hard' as you mean it? In my given example, would it be able to discern x^y or perhaps the formula for calculating pi? – snerd Apr 10 '15 at 21:21
  • "they interpolate not extrapolate" -- in other words - they can do a reasonably good job of telling you whether or not a given set of integers is a representation of a function it's been trained to recognize, but they can't themselves continue the sequence. have I got it? – snerd Apr 10 '15 at 21:24
  • 1
    Here is a good explanation of how neural networks can approximate functions: http://neuralnetworksanddeeplearning.com/chap4.html TBH I haven't experimented with, or researched it enough to know exactly what functions they are good at approximating vs other functions. I just don't expect it to perform well on data it wasn't trained on, or that can't be done with piecewise linear functions. – Houshalter Apr 11 '15 at 05:04
  • you seem to want to train it to recognize whether or not a sequence of numbers belongs to a class of functions? Are you feeding it into a recurrent neural network? Do you have a set of training data of numbers from the functions you want it to identify? – Houshalter Apr 11 '15 at 05:06
  • wowzers that's a great link, thank you! As for what I'm looking to do: I'm endeavoring to train a population of artificial-life agents, each driven by a neural network, to recognize patterns in a data space that are deemed useful to the user. I'm not even sure if this is possible yet, but what I really want is to say 'go look for interesting patterns', not 'go find instances of this particular pattern'. The number anagram example I gave earlier is an idea I have for an experiment to test the viability of the idea. – snerd Apr 11 '15 at 18:53

1 Answer


Continuing from the conversation I had with you in the comments:

Neural networks still might be useful. Instead of training a neural net to search for a single pattern, the neural net can be trained to predict the data. If the data contains a predictable pattern, the NN can learn it, and the weights of the NN will represent the pattern it has learned. I think that may be what you were intending to do.
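As a minimal sketch of that idea (a toy single-unit linear predictor, purely illustrative, not a full NN): train it to predict each term from the previous one, and the learned weights end up encoding the rule itself.

```python
# Toy sketch: fit a one-unit linear "network" y = w*x + b to predict the
# next term of a sequence from the current one. If the data follows a
# predictable rule, the learned weights encode it: for the arithmetic
# sequence 1, 3, 5, 7, ... we expect w ≈ 1 and b ≈ 2 (the "+2" pattern).

def train_next_term_predictor(seq, lr=0.01, epochs=2000):
    """Per-sample gradient descent on squared error over pairs (seq[i], seq[i+1])."""
    w, b = 0.0, 0.0
    pairs = list(zip(seq, seq[1:]))
    for _ in range(epochs):
        for x, y in pairs:
            err = (w * x + b) - y
            # Gradients of (pred - y)^2 with respect to w and b.
            w -= lr * 2 * err * x
            b -= lr * 2 * err
    return w, b

w, b = train_next_term_predictor([1, 3, 5, 7, 9, 11])
print(round(w, 2), round(b, 2))   # 1.0 2.0 — the weights ARE the pattern
print(round(w * 13 + b))          # 15 — the learned rule applied to a new term
```

Because the data exactly satisfies a linear rule, the per-sample gradients all vanish at the true solution, so plain SGD settles there; for noisier data or nonlinear rules you would need hidden units.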

Some things that might be helpful for you if you do this:

Autoencoders do unsupervised learning and can learn the structure of individual datapoints.
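To make that concrete, here is a forward-pass-only sketch with hand-picked weights (nothing here is learned): data lying on the line y = 2x passes through a one-number bottleneck and is reconstructed exactly, which is the sense in which the code captures the data's structure.

```python
# Structural sketch of an autoencoder with hand-picked (not learned) weights.
# 2-D points on the line y = 2x are squeezed through a 1-number bottleneck.

def encode(point):
    # 2 inputs -> 1 hidden code: 0.25*x + 0.375*y, which equals x when y = 2x.
    x, y = point
    return 0.25 * x + 0.375 * y

def decode(code):
    # 1 hidden code -> 2 outputs, reconstructing a point on the line.
    return (code, 2.0 * code)

p = (3.0, 6.0)                 # lies on y = 2x
print(decode(encode(p)))       # (3.0, 6.0): perfect reconstruction
print(decode(encode((5.0, 10.0))))  # (5.0, 10.0)
```

A trained autoencoder discovers such a compression itself by minimizing reconstruction error; points off the line would reconstruct poorly, which is exactly how the bottleneck reveals structure.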

Recurrent Neural Networks can model sequences of data rather than just individual datapoints. This sounds more like what you are looking for.
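A hand-rolled forward pass shows the key property (toy weights, chosen only for illustration): the hidden state folds in everything seen so far, so the network's output can depend on the whole sequence rather than on a single number.

```python
import math

# Elman-style RNN forward pass with illustrative, untrained weights:
# h_t = tanh(w_in * x_t + w_rec * h_{t-1} + bias)

def rnn_hidden_states(seq, w_in=0.5, w_rec=0.8, bias=0.0):
    """Return the hidden state after each element of the sequence."""
    h = 0.0
    states = []
    for x in seq:
        h = math.tanh(w_in * x + w_rec * h + bias)
        states.append(h)
    return states

# The same final input (7) yields different hidden states depending on the
# history — something a feed-forward net on single numbers cannot express.
a = rnn_hidden_states([1, 3, 5, 7])
b = rnn_hidden_states([7])
print(a[-1] != b[-1])   # True: the state encodes the preceding terms
```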

Compositional Pattern-Producing Networks (CPPNs) are a fancy name for neural networks that use mathematical functions as activation functions. This would allow you to model functions that aren't easily approximated by NNs with simple activation functions like sigmoids or ReLU. But usually this isn't necessary, so don't worry too much about it until after you have a simple NN working.
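A two-node sketch of the idea (weights made up for illustration): the structure is the usual weighted sums, but the activations are a sine and a Gaussian, so a periodic, symmetric shape falls out of just two nodes.

```python
import math

# CPPN-style network: same weighted-sum wiring as an ordinary NN, but the
# nodes use mathematical functions (sine, Gaussian) as activations.

def gaussian(z):
    return math.exp(-z * z)

def cppn(x, w1=1.0, w2=2.0):
    h = math.sin(w1 * x)      # periodic hidden node
    return gaussian(w2 * h)   # symmetric output node

# A plain sigmoid net would need many units to approximate this
# sine-inside-Gaussian shape; the CPPN expresses it with two nodes.
print(round(cppn(0.0), 4))           # 1.0    (sin(0)=0, exp(0)=1)
print(round(cppn(math.pi / 2), 4))   # 0.0183 (sin(pi/2)=1, exp(-4))
```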

Dropout is a simple technique where you randomly remove hidden units (typically half of them) on each training iteration. This seems to seriously reduce overfitting. It prevents complicated relationships between neurons from forming, which should make the models more interpretable, which seems like your goal.
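A minimal sketch of the "inverted dropout" variant, assuming a drop probability of 0.5 (the helper below is illustrative, not from any library):

```python
import random

# Inverted dropout: during training, zero each activation with probability p
# and scale survivors by 1/(1-p); at test time the layer is the identity,
# so no rescaling is needed at inference.

def dropout(activations, p=0.5, training=True, rng=random):
    if not training:
        return list(activations)   # no-op at inference time
    scale = 1.0 / (1.0 - p)
    return [a * scale if rng.random() >= p else 0.0 for a in activations]

rng = random.Random(0)             # seeded for reproducibility
h = [0.5, -1.2, 0.8, 2.0]
train_out = dropout(h, training=True, rng=rng)
test_out = dropout(h, training=False)
print(all(v == 0.0 or v == 2 * x for v, x in zip(train_out, h)))  # True
print(test_out == h)                                              # True
```

Each training step thus samples a different "thinned" network, which is what discourages units from co-adapting.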

Houshalter