I want to identify genes whose expression changes with age. I would like to do this with a linear mixed model, specifying age as a continuous covariate. The expression values come from different tissues. Tissue is a fixed effect and individual is a random effect.

In the imported dataset, age is stored as a string giving a range, e.g. "20-29 years". I tried using the mean of the upper and lower limits as the age, stored as a numeric (num) vector. When I extract the p-value for age after running an ANOVA, it does not match the value reported in a previous study. I am not sure whether a numeric vector is the right choice, or whether there is another way of specifying a continuous covariate in R.
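For reference, the midpoint conversion described above might look like the following minimal sketch. The vector `age_range` and its values are made up for illustration, and the commented-out model line assumes the lme4 package and hypothetical column names (`expression`, `tissue`, `individual`); neither is taken from the actual analysis.

```r
# Hypothetical age ranges in the format described above
age_range <- c("20-29 years", "30-39 years", "60-69 years")

# Pull the two numbers out of each string
bounds <- regmatches(age_range, gregexpr("[0-9]+", age_range))

# Midpoint of each range, as a plain numeric vector
age_mid <- sapply(bounds, function(b) mean(as.numeric(b)))

is.numeric(age_mid)  # a numeric vector is treated as a continuous covariate

# Assumed model call (lme4, hypothetical column names in a data frame d):
# fit <- lme4::lmer(expression ~ age_mid + tissue + (1 | individual), data = d)
```

A numeric column entered in the formula this way is fitted with a single slope, whereas a factor or character column would be expanded into one coefficient per category.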

Thank you

  • If age is coded as a category, I don't see how you can treat it as a continuous variable - doing so may significantly change the results. – User7598 May 02 '16 at 16:43
  • Maybe the other value was not correct. You shouldn't be "wanting" analyses of experiments to come out a particular way. It is useful to challenge existing knowledge if new data is not confirmatory of earlier results. (But it does appear that you need statistical consultation.) – IRTFM May 02 '16 at 17:13
  • Using the mean of the category (i.e. assuming a uniform distribution within the category), or doing something a bit fancier such as suggested by @r.kaiza below, is probably the best you can do. – Ben Bolker May 25 '16 at 20:33

1 Answer

Look at the frequency distribution of the age categories. Try to match it to a probability distribution, and then draw ages at random from that distribution, so that the probability assigned to each interval equals the relative frequency of its bin.
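A minimal sketch of the simplest version of this idea follows. It does not fit a distribution across bins; it only assumes ages are uniform within each reported range, which is a simplification. The data frame `d`, its column names, and the choice to treat "20-29 years" as covering 20 up to just under 30 are all assumptions made for illustration.

```r
set.seed(1)  # reproducible draws

# Hypothetical data: one row per sample, several tissues per individual
d <- data.frame(
  individual = c("A", "A", "B", "B", "C"),
  age_range  = c("20-29 years", "20-29 years", "60-69 years",
                 "60-69 years", "40-49 years"),
  stringsAsFactors = FALSE
)

# Parse the lower and upper bound of each range
bounds  <- regmatches(d$age_range, gregexpr("[0-9]+", d$age_range))
d$lower <- as.numeric(sapply(bounds, `[`, 1))
d$upper <- as.numeric(sapply(bounds, `[`, 2))

# One draw per individual, so all tissues of a person share the same age
ind <- unique(d[, c("individual", "lower", "upper")])
ind$age_draw <- runif(nrow(ind), min = ind$lower, max = ind$upper + 1)

# Copy each individual's drawn age back onto the sample rows
d$age_draw <- ind$age_draw[match(d$individual, ind$individual)]
```

Because each draw is random, it is probably worth repeating the draw and refitting the model several times to see how much the age p-value moves.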

k.

r.kaiza