Let's say I have a response variable which is not normally distributed and an explanatory variable. Let's create these two variables first (coded in R):
set.seed(12)
resp = (rnorm(120)+20)^3.79
expl = rep(c(1,2,3,4),30)
I run a linear model and find that the residuals are not normally distributed. (I know a Shapiro-Wilk test alone might not be enough to justify that conclusion, but that is not the point of my question.)
m1=lm(resp~expl)
shapiro.test(residuals(m1))
# p-value = 0.01794
Therefore I want to transform my response variable (choosing the transformation with a Box-Cox procedure, for example).
m2=lm(resp^(1/3.79)~expl)
shapiro.test(residuals(m2))
# p-value = 0.4945
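A minimal sketch of how a Box-Cox search could suggest such an exponent, assuming the MASS package is available. The grid of candidate values and the name lambda_hat are illustrative choices, not part of the original analysis, and the estimate need not land exactly on 1/3.79:

```r
library(MASS)  # provides boxcox()

set.seed(12)
resp <- (rnorm(120) + 20)^3.79
expl <- rep(c(1, 2, 3, 4), 30)
m1 <- lm(resp ~ expl)

# Profile log-likelihood over a grid of candidate exponents (assumed grid)
bc <- boxcox(m1, lambda = seq(-2, 2, by = 0.01), plotit = FALSE)

# Candidate exponent with the highest profile log-likelihood
lambda_hat <- bc$x[which.max(bc$y)]
lambda_hat
```

With plotit = TRUE, boxcox() also draws the profile with an approximate 95% interval, which may be more informative than the single maximizing value.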
OK, now my residuals look normally distributed, which is fine! I now want to make a graphical representation of my data and my model. But I do not want to plot my response variable in its transformed form, because it would lose much of its intuitive meaning. Therefore I do:
plot(x=expl,y=resp)
What if I now want to add the fitted model to the plot? I could do this:
abline(m2) # m2 is the model with transformed variable
but of course the line does not fit the plotted data, because m2 predicts the response on the transformed scale. I could do this instead:
abline(m1) # m1 is the model with the original variable.
but that is not the model I used for the statistics! How can I back-transform the line predicted by m2 so that it fits the data on the original scale?
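To make the goal concrete, here is a sketch of what such a back-transformation might look like, assuming it is acceptable to predict on the transformed scale and then invert the power. The names grid, pred_transformed and pred_original are hypothetical, and the resulting curve is the back-transformed fit rather than a model of the mean of resp itself:

```r
set.seed(12)
resp <- (rnorm(120) + 20)^3.79
expl <- rep(c(1, 2, 3, 4), 30)
m2 <- lm(resp^(1/3.79) ~ expl)

# Predict on the transformed scale over a fine grid of expl values
grid <- data.frame(expl = seq(min(expl), max(expl), length.out = 100))
pred_transformed <- predict(m2, newdata = grid)

# Invert the power transformation to return to the original scale
pred_original <- pred_transformed^3.79

# Plot the raw data and overlay the back-transformed curve
plot(x = expl, y = resp)
lines(grid$expl, pred_original)
```

Because the inverse transformation is nonlinear, the overlay is a curve rather than a straight line, so lines() is used here instead of abline().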