I get confused by activation functions. Why do we so widely use the ReLU function when, in the end, its mapping is just a line? Using sigmoid or tanh makes the decision boundary a squiggle that fits the data well, but doesn't ReLU map a line (aW + b) to a line as well? How can this fit the data better?
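To make what I mean concrete, here is a minimal NumPy sketch (the weights and biases are arbitrary numbers I picked for illustration): a single ReLU applied to a*x + b is not a straight line but a bent one (flat on one side, linear on the other), and summing a few such units bends the output at several points, so the overall mapping is piecewise linear rather than a single line.

```python
import numpy as np

def relu(x):
    # ReLU: zero for negative inputs, identity for positive inputs
    return np.maximum(0.0, x)

x = np.linspace(-3, 3, 7)

# A single ReLU unit on a*x + b: flat at 0 on the left, linear on the right
print(relu(2 * x + 1))

# A tiny hidden layer of three ReLU units (arbitrary weights/biases),
# combined linearly: the result bends at several points, i.e. it is
# piecewise linear, not a single line
hidden = relu(np.outer(x, [1.0, -1.0, 2.0]) + np.array([0.0, 1.0, -2.0]))
output = hidden @ np.array([1.0, 0.5, -1.5]) + 0.2
print(output)
```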
