0

How can I extract the coefficients of an orthogonal polynomial regression in R?

It works like a charm with a raw regression:

#create datas
set.seed(120)
x= 1:20
y=x^2 + rnorm(length(x),0,10)
datas = data.frame(x,y)

#compute raw model with function poly
mod= lm(data=datas, y~poly(x,2,raw=T))

#get coefficients with function coef()
coefficients = coef(mod)

#construct polynom and check fitted values
fitted_values = mod$fitted.values
x0 = datas$x[1]
solution = coefficients[1]+ coefficients[2]*x0^1 + coefficients[3]*x0^2
print(solution)
# 1.001596 
print(fitted_values[1])
# 1.001596
# 1.001596  == 1.001596

But the coefficient obtained with the function coef on an orthogonal lm does not work:

#create datas
set.seed(120)
x= 1:20
y=x^2 + rnorm(length(x),0,10)
datas = data.frame(x,y)

#compute raw model with function poly
mod = lm(data=datas, y~poly(x,2,raw=F))

#get coefficients with function coef()
coefficients = coef(mod)
fitted_values = mod$fitted.values

#construct polynom and check fitted values
x0 = datas$x[1]
solution = coefficients[1]+ coefficients[2]*x0^1 + coefficients[3]*x0^2
print(solution)
# 805.8476 
print(fitted_values[1])
# 1.001596
# 1.001596 != 805.8476 

Is there another way to get the right parameters to construct a polynom of the form c0 + c1 * x + .. + cn * x^n and use it to solve or predict?

I need to solve the equation, meaning getting the x given a y with the function base::solve.

Thank you

jakzr
  • 129
  • 1
  • 1
  • 11

1 Answers1

0

The issue is that you don't only need the coefficients but also the orthogonal polynomials (instead of the raw polynomials you are trying to use). The latter are constructed by model.matrix:

newdata <- model.matrix(~ poly(x, 2), data = datas)
solution <- newdata %*% coefficients
print(solution[1])
# [1] 1.001596 
print(fitted_values[1])
#1.001596

I don't understand the connection to solve but trust you can take it from here.

Roland
  • 127,288
  • 10
  • 191
  • 288
  • I need to solve f(x) = 1.001596 and find x=1 with the model of my example. Sometimes the raw regression gives NA for the x^2 coefficient while the orthogonal regression works. But i can't solve an orthogonal regression with the base::solve() function. – jakzr Sep 21 '20 at 15:02
  • Why are you using `solve` for this? https://en.wikipedia.org/wiki/Quadratic_equation#Solving_the_quadratic_equation – Roland Sep 22 '20 at 05:54
  • I used ```solve``` because it works with a list of non-orthogonal polynomial coefficient but does not fit for orthogonal polynomial. I either need to find the corresponding raw polynom of an orthogonal polynom or find a way to solve f(x) = y with an orthogonal polynom. – jakzr Sep 23 '20 at 12:38
  • Maybe i'm just missing the mathematical way do to so. I'm gonna check your link – jakzr Sep 23 '20 at 12:52
  • I don't understand what you are doing with `solve`. I don't see how you are deriving `x` for a given `y` from it. It solves `a %*% x = b` for x. But we have `X %*% beta = y` – Roland Sep 23 '20 at 12:52
  • Here is how i used ```solve``` when I could retrieve coefficients:```polyn=polynom::polynomial(c(0,0,1))``` creates the x^2 polynome object. Then ```solve(polyn,1)``` solve the equation x^2 = 1 and return -1 et 1. But if i can solve an orthogonal polynom without ```solve``` that's fine too. I was not used to matrix multiplication operators in R, maybe that's what I'm missing here. – jakzr Sep 23 '20 at 13:00
  • I still don't get it. Are you not doing linear regression? Linear regression would solve `X %*% beta = y` for beta if there was no uncertainty. Since there is uncertainty, it minimizes `sum((X %*% beta - y) ^ 2)`. What do you need `solve` for if you do regression? – Roland Sep 23 '20 at 13:06
  • Sorry I might not be clear enough i my question i did not want to over explain, i'm doing for example a regression on a dataset that has a range of x = [10,20] and a range of y = [100,200]. I obtain a linear model which is an orthogonal polynomial regression, my question is "given the lm I just obtained, which is an orthogonal polynomial of degree 2 or more, what x gives me y=350 ?". ```solve``` did the job for a raw polynom, but not for an orthogonal polynom. I wrote a dichotomic function in R to find the solutions but it's slow and there is probably another faster way that I don't know. – jakzr Sep 23 '20 at 13:52
  • 1
    Would a numeric approach be sufficient? `uniroot(function(x) predict(mod, newdata = list(x = x)) - y, c(0, 100))` This finds only one root and, as you know, there are usually two. Adjust the interval to find both. A plot is useful for this. – Roland Sep 24 '20 at 06:17