
I would like to compute a non-linear decision boundary with sigmoid neurons, with an input layer and an output layer. The neuron has 2 inputs x1, x2 and a bias. I am trying to compute this.

How is this done? For the perceptron, if

         w*x + b >= 0 for a negative sample, then
         we perform w = w - x,
         and if w*x + b < 0 for a positive sample, then
         w = w + x

until the errors reduce to a low value. I am using Octave for this computation.
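The update rule above can be sketched as a short loop. This is an illustrative Python/NumPy version (the same idea translates directly to Octave); note that it also updates the bias alongside `w`, which the snippet above leaves implicit, and the epoch count of 20 is an arbitrary choice.

```python
import numpy as np

# OR truth table: the training data mentioned in the comments below.
X = np.array([[0, 0], [0, 1], [1, 0], [1, 1]])
y = np.array([0, 1, 1, 1])

w = np.zeros(2)
b = 0.0
for _ in range(20):                               # iterate until errors die out
    for xi, target in zip(X, y):
        activation = 1 if w @ xi + b >= 0 else 0
        if target == 0 and activation == 1:       # misclassified negative sample
            w, b = w - xi, b - 1
        elif target == 1 and activation == 0:     # misclassified positive sample
            w, b = w + xi, b + 1

print(w, b)  # converges to w = [1, 1], b = -1 for this data
```

Because OR is linearly separable, the perceptron convergence theorem guarantees this loop settles on weights that classify all four points correctly.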

Is there an iterative method for sigmoid neurons? How do we get the non-linear boundary?

Tinniam V. Ganesh
  • `sigmoid(b0 + x1*w1 + x2*w2)` where `sigmoid(x) = 1/(1+e^(-x))` is the activation of a sigmoid neuron with two inputs `x1`, `x2`, weights `w1`, `w2` and a bias `b0`. The code snippet you posted appears to be (crudely) adjusting the weights to /train/ the perceptron. Are you asking how to train a neural net? Or how to compute the activation? (What do you mean by "boundary"? Are you trying to solve a classification problem?) – BadZen Nov 12 '16 at 16:23
  • Yes, the above code is elementary. I wanted to simulate an OR function as shown above, where 0 0 -> 0, 0 1 -> 1, 1 0 -> 1 & 1 1 -> 1. Is it possible to draw a non-linear boundary as shown above? – Tinniam V. Ganesh Nov 12 '16 at 16:34
  • @BadZen can you tell me how the neuron would adjust based on your computation with sigmoid function? – Tinniam V. Ganesh Nov 12 '16 at 16:35
  • Typically for multilayer nets one uses an algorithm called backpropagation. See https://en.wikipedia.org/wiki/Backpropagation – BadZen Nov 12 '16 at 16:37
  • The reason I ask is I see examples as above of simple OR, AND and NOT gates in many books. I am not sure how the weights are computed. The above picture does not seem to use back propagation. – Tinniam V. Ganesh Nov 12 '16 at 16:45
  • Why do you think that backprop was not used to generate the weights in the classification net corresponding to that boundary? It's just a picture of the /result/. – BadZen Nov 12 '16 at 16:47
  • Is this question about how to determine the weights of a logic gate implemented with a single sigmoid neuron, or about classifying data in a plane with a multilayer net? The pictures you just added don't seem to be about the same thing. – BadZen Nov 12 '16 at 16:50
  • (In particular notice that those logic gate boundaries are totally linear. Also note that the sigmoid function is sign-preserving, so your zero boundary will not change moving from `f(x)` to `sigmoid(f(x))`...) – BadZen Nov 12 '16 at 16:53
  • @BadZen thanks for your comments. Maybe I misunderstood. To me the OR gate is non-linear i.e. when x1,x2 < k then result is 0 and when x1 or x2 or both x1 & x2 >= k then the result is 1, which is what the non-linear boundary above shows. However you mention that the logic gates are linear. Need to think about this. – Tinniam V. Ganesh Nov 12 '16 at 17:17

1 Answer


There are two parts to this question: one related to plotting, and one to the nets themselves. Let's start with the second part. You need to understand that:

  • a single neuron, whether without any activation or with a sigmoid on it, is a linear model: its decision boundary is a line. In order to get a nonlinear boundary you need either a non-monotonic activation (like an RBF) or at least 1 hidden layer.
  • some logic gates are linear, and some are not. In particular OR is linear (as is AND), but XOR is not. The proof of the linearity of OR is really simple, since OR can be implemented as

    cl(x) = x1 + x2 - 0.5
    

    if you now take the sign of the above expression, you will see that it is 1 iff x1 + x2 > 0.5, which for binary inputs happens exactly when at least one of x1, x2 is 1.
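    A two-line check of this claim (a Python sketch; the function name `cl` follows the formula above):

        for x1 in (0, 1):
            for x2 in (0, 1):
                cl = x1 + x2 - 0.5
                pred = 1 if cl > 0 else 0
                print((x1, x2), pred)   # prints the OR truth table: 0, 1, 1, 1

    so a single linear unit with weights (1, 1) and bias -0.5 reproduces OR exactly.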

In terms of decision boundaries: for linear models it is straightforward, because the decision boundary can be determined analytically. For a nonlinear model, however, this is in general not possible, so what we do is an approximation. Say you want to plot the decision boundary on a plane, for x1 in [-T, T] and x2 in [-T, T]. You sample points very densely from the input space (like (-T, -T), (-T, -T+0.01), ...) and check the classification of each. You get a huge matrix of 0s and 1s, and you simply draw a contour plot of this function.
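A minimal sketch of this dense-sampling approach, in Python/NumPy (the same pattern works in Octave with `meshgrid` and `contourf`). The network weights here are made up purely for illustration: a one-hidden-layer sigmoid net whose boundary is not a single line.

```python
import numpy as np

sigmoid = lambda z: 1.0 / (1.0 + np.exp(-z))

# Dense grid over [-T, T] x [-T, T] with step 0.01.
T, step = 2.0, 0.01
xs = np.arange(-T, T, step)
X1, X2 = np.meshgrid(xs, xs)

# Hypothetical 2-unit hidden layer + output unit (weights chosen by hand,
# not trained) -- just something nonlinear to classify the grid with.
H = sigmoid(3 * X1 + 3 * X2 - 1) + sigmoid(-3 * X1 - 3 * X2 - 1)
Z = (sigmoid(5 * H - 4) > 0.5).astype(int)   # huge 0/1 matrix over the grid

print(Z.shape)  # (400, 400)

# To draw the boundary, hand the matrix to a contour routine, e.g.:
# import matplotlib.pyplot as plt
# plt.contourf(X1, X2, Z); plt.show()
```

The plotted contour between the 0-region and the 1-region of `Z` is the (approximate) decision boundary; a finer `step` gives a smoother curve at the cost of more evaluations.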

lejlot