I'm using a 3-layer neural network for a classification problem: 1) ~2k neurons 2) ~2k neurons 3) 20 neurons. My training set consists of 2 examples, and most of the inputs in each example are zeros. For some reason, after backpropagation training, the network gives virtually the same output for both examples: the output is either valid for only one of the examples, or has 1.0 at every position where either example has a 1. It reaches this state after the first epoch and doesn't change much afterwards, even if the learning rate is the smallest positive double value. I use sigmoid as the activation function. I thought something might be wrong with my code, so I tried the AForge open-source library, and it seems to suffer from the same issue. What might be the problem here?
Solution: I removed one layer and decreased the number of neurons in the hidden layer to 800. The oversized ~2k-neuron layers were likely driving the weighted sums into sigmoid saturation, where gradients vanish and the outputs collapse to nearly identical values; the smaller network trains normally.
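For reference, here is a minimal sketch of the reduced architecture using AForge.Neuro. The input size of 100, the learning rate, and the sample data are placeholders, not my actual values:

```csharp
using System;
using AForge.Neuro;
using AForge.Neuro.Learning;

class Program
{
    static void Main()
    {
        // Hypothetical input size; substitute your actual feature count.
        const int inputsCount = 100;

        // One hidden layer of 800 sigmoid neurons, 20 sigmoid outputs
        // (the reduced architecture described above).
        var network = new ActivationNetwork(
            new SigmoidFunction(), inputsCount, 800, 20);

        var teacher = new BackPropagationLearning(network)
        {
            LearningRate = 0.1 // placeholder value
        };

        // Placeholder sparse training data: 2 examples, mostly zeros,
        // with one-hot targets.
        double[][] inputs = new double[2][];
        double[][] outputs = new double[2][];
        var rnd = new Random(0);
        for (int i = 0; i < 2; i++)
        {
            inputs[i] = new double[inputsCount];
            inputs[i][rnd.Next(inputsCount)] = 1.0; // a single non-zero input
            outputs[i] = new double[20];
            outputs[i][i] = 1.0;
        }

        // RunEpoch returns the summed squared error over the whole set.
        for (int epoch = 0; epoch < 1000; epoch++)
        {
            double error = teacher.RunEpoch(inputs, outputs);
            if (epoch % 100 == 0)
                Console.WriteLine("epoch {0}: error {1:F4}", epoch, error);
        }

        // Verify that the two examples now produce distinct outputs.
        Console.WriteLine(string.Join(", ", network.Compute(inputs[0])));
        Console.WriteLine(string.Join(", ", network.Compute(inputs[1])));
    }
}
```

With the 2k-2k-20 configuration, the same training loop collapsed to identical outputs for me after one epoch; shrinking to a single 800-neuron hidden layer was enough for the outputs to separate.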