
Reading course notes of Andrew NG's machine learning course it states for linear regression :

Take a training set and pass it into a learning algorithm. The algorithm outputs a function h (the hypothesis). h takes an input and tries to output an estimated value y.

It then goes on to say :

present h as: h_theta(x) = theta0 + theta1 * x

Does this not mean the hypothesis was not output by the learning algorithm? Instead, we just defined it as h_theta(x) = theta0 + theta1 * x.

Instead of "Take a training set and pass it into a learning algorithm. The algorithm outputs a function h (the hypothesis)," should the statement be "Take a training set and pass it into a learning algorithm. The algorithm outputs the value(s) that make the hypothesis as accurate as possible"?
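In code terms, here is a hypothetical sketch (not the course's code; the values are made up) of the second reading, where the algorithm hands back raw values and we build h from the stated template:

```python
# Hypothetical sketch: the "learning algorithm" returns only values,
# and we construct the hypothesis h from the template
# h_theta(x) = theta0 + theta1 * x ourselves.

thetas = (1.0, 2.0)                      # made-up values a learner might return

def h(x):
    """The hypothesis, built from the returned values."""
    return thetas[0] + thetas[1] * x

print(h(3))  # 1.0 + 2.0 * 3 = 7.0
```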

blue-sky
  • `Does this not mean the hyptohesis was not outputted by the learning algorithm, instead we just defined it as h theta(x) = theta0 + theta1x` - what do you mean here? `theta(x) = theta0 + theta1x` is your hypothesis. – cel Jul 19 '15 at 18:27
  • @cel the hypothesis is defined as "h theta(x) = theta0 + theta1x" . The learning algorithm did output this hypothesis as stated in course notes. – blue-sky Jul 19 '15 at 18:29
  • I don't get the difference between the two statements. Returning a function or returning all necessary parameters for a parametric function is basically the same information. – cel Jul 19 '15 at 18:31
  • @cel I meant to say in previous comment "@cel the hypothesis is defined as "h theta(x) = theta0 + theta1x" . The learning algorithm did not output this hypothesis as stated in course notes. " I distinguish between returning a function and returning the evaluation of a function. The learning algorithm appears to return value, not a function. – blue-sky Jul 19 '15 at 18:38
  • I think cel is correct here, but we cannot see your system outputs. It is highly likely that you are getting the *coefficients* of your hypothesis as simply two raw numbers. I assume the task is for you to take the coefficients and build the equation. – roganjosh Jul 19 '15 at 18:41
  • @blue-sky Your further comment came while I was typing mine. The formal answer by Timothy Murphy and cel's answer are correct. Your confusion is regarding coefficients (theta1x) and a constant (theta0) and the fully defined equation. Just plug the numbers in. – roganjosh Jul 19 '15 at 18:43

2 Answers


For the case of linear regression you want your learning algorithm to output a linear function.

That is, h(x) = theta0 + theta1 * x.

In this case the learning algorithm learns the optimal theta0 and theta1 to fit your training data.

If you wanted your learning algorithm to learn a 3rd degree polynomial the output of your learning model would be a, b, c and d such that

h(x) = ax^3 + bx^2 + cx + d

But your assertion is correct: the learning algorithm chooses the best parameters to minimize the cost of an error function. Usually this is squared error plus some regularization terms.
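As a minimal sketch of this (illustrative only, not the course's code), here is the closed-form least-squares solution for the simple linear case. Note that what the learner returns is the pair (theta0, theta1); the hypothesis h is then fully determined by those values:

```python
# Minimal sketch of a "learning algorithm" for simple linear regression.
# It returns the parameters theta0, theta1 minimizing squared error;
# those parameters fully define the hypothesis h(x) = theta0 + theta1 * x.

def learn(xs, ys):
    """Closed-form least squares for h(x) = theta0 + theta1 * x."""
    n = len(xs)
    mean_x = sum(xs) / n
    mean_y = sum(ys) / n
    theta1 = (sum((x - mean_x) * (y - mean_y) for x, y in zip(xs, ys))
              / sum((x - mean_x) ** 2 for x in xs))
    theta0 = mean_y - theta1 * mean_x
    return theta0, theta1

# Training data generated by y = 1 + 2x; the learner recovers (1, 2).
theta0, theta1 = learn([1, 2, 3, 4], [3, 5, 7, 9])
h = lambda x: theta0 + theta1 * x  # the learned hypothesis
```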

Timothy Murphy

In principle you are right here. A true learning algorithm as defined in learning theory is an algorithm that gets labelled instances and a whole class of possible hypotheses as input and then chooses one hypothesis as an output.

So strictly speaking, an algorithm that outputs the predictions is not a learning algorithm. But of course such an algorithm can be split into a learning algorithm (the part that actually learns the parameters, here the thetas) and a prediction algorithm that transforms input instances into predictions, which are then returned to the caller.
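That split can be sketched as follows (an illustrative sketch with made-up names, not the course's code; the learner here uses batch gradient descent, the method the course covers for linear regression):

```python
# Sketch of the split: learn() is the learning algorithm (outputs the
# thetas via batch gradient descent); predict() is the separate
# prediction algorithm that applies those thetas to new inputs.

def learn(xs, ys, alpha=0.05, steps=5000):
    """Batch gradient descent for h(x) = theta0 + theta1 * x."""
    theta0 = theta1 = 0.0
    n = len(xs)
    for _ in range(steps):
        errs = [theta0 + theta1 * x - y for x, y in zip(xs, ys)]
        grad0 = sum(errs) / n                               # d(cost)/d(theta0)
        grad1 = sum(e * x for e, x in zip(errs, xs)) / n    # d(cost)/d(theta1)
        theta0 -= alpha * grad0
        theta1 -= alpha * grad1
    return theta0, theta1

def predict(params, x):
    """The prediction algorithm: apply the learned parameters."""
    theta0, theta1 = params
    return theta0 + theta1 * x

# Training data generated by y = 1 + 2x.
params = learn([1, 2, 3, 4], [3, 5, 7, 9])
estimate = predict(params, 5)  # approximately 11
```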

cel
  • I'm not sure that you can make the distinction that you have here. He states in the question that the algorithm is returning the coefficients (well, he says theta values), not predictions. Therefore, in the absence of any code, this could simply be a true learning algorithm with just a set of `print` statements that return the result of learning. I cannot see any suggestion that there's another part that will then take new inputs and use the linear equation for outputs e.g. a prediction algorithm. – roganjosh Jul 19 '15 at 19:03
  • If I understood the question correctly, then the book states that the algorithm outputs a function, but it actually outputs transformed values - which is the reason for the confusion. In my algorithmic theory, there are no print statements :). But true, if the algorithm outputs the parameters in a suitably encoded form, then it is a true learning algorithm. But then again, I don't really understand OP's question. – cel Jul 19 '15 at 19:09
  • 1
    I think the question is simply being slightly pedantic about wording and when your initial comments didn't result in an answer, you've made it more complex. In principle you are completely correct, but I'm not sure the principle applies here. My take on the question: he's suggesting that because the coefficients are returned as raw values, then it is not returning a function. Your initial observation is then correct - they are one and the same, just not presented in the same way, and he thinks the course wording should reflect that – roganjosh Jul 19 '15 at 19:16
  • 1
    @roganjosh, fair enough, but my question now exactly answers the title of the question :P – cel Jul 19 '15 at 19:23
  • Haha, true. I think in the absence of any feedback, we've hit a dead end. There should be more than enough info in the thread at this point to answer the question, whatever that might be in its true form :P I up-voted to reflect answering the title. – roganjosh Jul 19 '15 at 19:28