I have age as a covariate in my data. It is a continuous variable, ranging from 18 to 70 years.

I am fitting a logistic regression and have decided to represent age as a polynomial.

The data contain 4021 observations, so I have copied just the first few rows to show what they look like:

head(both)

  gender passinggrade age  prog
1    man        FALSE  69 FRIST
2    man           NA  70 FRIST
3  woman           NA  65 FRIST
4  woman         TRUE  68 FRIST
5  woman           NA  65 NMFIK
6    man        FALSE  70 FRIST

My model:

mod.fit <- glm(passinggrade ~ prog + gender + age, family = binomial, data = both)

summary(mod.fit)

So what I'm wondering is: how do I treat age as a polynomial? I don't know whether I need to change something in my R code. I have not done anything in R yet to "make age a polynomial", so my question is quite simple: how do you do it?

malin
  • Many examples of using `poly` exist on SO as well as warnings not to use `I(x^2)`. There's also a 'polynomial' package but that should not be needed here. – IRTFM Apr 08 '15 at 18:46

1 Answer

You can do this in a few different ways:

glm(passinggrade ~ prog + gender + poly(age, 3), ...

# Less preferred...
glm(passinggrade ~ prog + gender + age + I(age^2) + I(age^3), ...
glm(passinggrade ~ prog + gender + cbind(age, age^2, age^3), ...

See this post for more information and discussion.
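
As a minimal sketch of why `poly()` is usually preferred, the example below fits the same cubic two ways and compares them. The data frame `both_sim` and the quadratic relationship used to generate it are made up for illustration (they are not the asker's data), and `prog` is left out to keep the example short. Orthogonal and raw polynomials span the same model, so the fitted probabilities should agree; what differs is how strongly the individual columns are correlated, which affects how the separate coefficients can be read.

```r
# Simulated stand-in for `both` (hypothetical values, not the real data)
set.seed(1)
n <- 500
both_sim <- data.frame(
  age    = sample(18:70, n, replace = TRUE),
  gender = factor(sample(c("man", "woman"), n, replace = TRUE))
)

# Assumed "true" relationship: log-odds of passing are quadratic in age
eta <- -0.5 + 0.08 * (both_sim$age - 44) - 0.004 * (both_sim$age - 44)^2
both_sim$passinggrade <- rbinom(n, 1, plogis(eta)) == 1

# Orthogonal polynomial of degree 3 (the default behaviour of poly())
fit_orth <- glm(passinggrade ~ gender + poly(age, 3),
                family = binomial, data = both_sim)

# Raw powers written out by hand
fit_raw <- glm(passinggrade ~ gender + age + I(age^2) + I(age^3),
               family = binomial, data = both_sim)

# Both parameterisations describe the same curve, so the fitted
# probabilities should differ only by numerical noise
max_diff <- max(abs(fitted(fit_orth) - fitted(fit_raw)))

# Columns from poly() are uncorrelated; raw powers are nearly collinear
cor_orth <- cor(poly(both_sim$age, 3))
cor_raw  <- cor(cbind(both_sim$age, both_sim$age^2, both_sim$age^3))

print(max_diff)
print(round(cor_orth, 3))
print(round(cor_raw, 3))
```

If raw-power coefficients are needed for reporting, `poly(age, 3, raw = TRUE)` gives the same columns as the `I()` version in a single term.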

JasonAizkalns
  • I think you should edit your answer to deprecate the last two. – IRTFM Apr 08 '15 at 18:47
  • @BondedDust I'd rather post to examples/discussions as to why to avoid the last two so that others can learn, especially since `I(x^2)` is still prevalent in many examples. – JasonAizkalns Apr 08 '15 at 18:49
  • How do I know which polynomial degree I should use? Polynomial regression? Or should I just keep adding terms to the GLM and stop when they are no longer significant? If I add a polynomial of degree 4, all terms are significant, but with degree 5 they are not significant ... what should I do? @BondedDust – malin Apr 09 '15 at 08:25
  • @JasonAizkalns How do I know which polynomial degree I should use? Polynomial regression? Or should I just keep adding terms to the GLM and stop when they are no longer significant? If I add a polynomial of degree 4, all terms are significant, but with degree 5 they are not significant ... what should I do? – malin Apr 09 '15 at 08:26
  • You should probably be using a method that takes into account the fact that you are probably performing some sort of data-dredging when you do this. There are several methods that have been developed that attempt to offer statistically principled, automated approaches to this. The `mgcv` package is one fairly well-respected approach to automated polynomial regression. Looking at the p-values of individual terms is not the right way. If you stick with the "manual" process, then model comparisons of deviance, using penalties for the multiple comparisons when assessing significance, are better. – IRTFM Apr 09 '15 at 14:55
  • @malin Because you are new on SO, please read [**about Stackoverflow**](http://stackoverflow.com/about), [**what to do when someone answers**](http://stackoverflow.com/help/someone-answers), [about **voting**](http://stackoverflow.com/help/why-vote) and [about **accepting** answers](http://meta.stackoverflow.com/a/5235). – Henrik Apr 12 '15 at 12:32
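
The "manual" deviance comparison suggested in the comments could look roughly like the sketch below. It uses only base R (`glm`, `anova`, `p.adjust`, `AIC`); the simulated data frame `sim`, the quadratic true relationship, the maximum degree of 5, and the choice of Holm's correction are all assumptions made for illustration, not something prescribed in the thread.

```r
# Simulated data (hypothetical; the true log-odds are quadratic in age)
set.seed(2)
n <- 800
sim <- data.frame(age = sample(18:70, n, replace = TRUE))
eta <- 0.3 + 0.06 * (sim$age - 44) - 0.005 * (sim$age - 44)^2
sim$passinggrade <- rbinom(n, 1, plogis(eta)) == 1

# Fit nested models with polynomial degree 1 to 5
fits <- lapply(1:5, function(d) {
  glm(passinggrade ~ poly(age, d), family = binomial, data = sim)
})

# Sequential likelihood-ratio (deviance) tests: each row compares a model
# with the one of the next lower degree
lr <- anova(fits[[1]], fits[[2]], fits[[3]], fits[[4]], fits[[5]],
            test = "Chisq")

# Four tests were run, so adjust the p-values for multiple comparisons
# (Holm is one possible choice)
p_raw <- lr[["Pr(>Chi)"]][-1]
p_adj <- p.adjust(p_raw, method = "holm")

# AIC as an alternative, penalised criterion for choosing the degree
aic <- sapply(fits, AIC)

print(lr)
print(data.frame(degree = 2:5, p_raw = p_raw, p_holm = p_adj))
print(data.frame(degree = 1:5, AIC = aic))
```

Comparing whole models this way avoids reading the p-values of individual polynomial terms, which is the practice the comment warns against.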