I had to transform a response variable (e.g. Variable1) to meet the assumptions of linear models in lmer, using an approach for heavy-tailed data suggested here https://www.r-bloggers.com/2020/01/a-guide-to-data-transformation/ and demonstrated below:

TransformVariable1 <- sqrt(abs(Variable1 - median(Variable1)))

I then fit the data to the following example model:

fit <- lmer(TransformVariable1 ~ x + y + (1|z), data = dataframe) 

Next, I updated the reference grid to account for the transformation, as suggested in the question "Specifying that model is logit transformed to plot backtransformed trends":

rg <- update(ref_grid(fit), tran = "TransformVariable1")

Nevertheless, the emmeans are not back-transformed to the original scale after using the following command:

fitemm <- as.data.frame(emmeans(rg, ~ x + y, type = "response"))

My question is: how can I back-transform the emmeans to the original scale?

Thank you in advance.

Progman

1 Answer

There are two major problems here.

The lesser of them is in specifying tran. You need to either specify one of a handful of known transformations, such as "log", or a list with the needed functions to undo the transformation and implement the delta method. See the help for make.link, make.tran, and vignette("transformations", "emmeans").
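As a rough sketch of what a valid tran specification can look like, the example below uses the pigs data that ships with emmeans rather than the poster's data. The names pigs2, logconc, fit_log, rg_log and emm_log are made up for illustration. The log is taken before fitting, so the model formula gives emmeans no hint about the transformation and it has to be declared by name:

```r
library(emmeans)

# Illustrative data only: take the log beforehand so that the response
# in the model formula is an already-transformed column.
pigs2 <- transform(pigs, logconc = log(conc), percent = factor(percent))
fit_log <- lm(logconc ~ source + percent, data = pigs2)

# Declare the transformation using one of the known names ("log"),
# then ask for results on the response scale.
rg_log <- update(ref_grid(fit_log), tran = "log")
emm_log <- emmeans(rg_log, ~ source, type = "response")
emm_log
```

A made-up name such as "TransformVariable1" cannot work in this position, because emmeans has no inverse or derivative functions associated with it.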

The much more serious issue is that the transformation used here is not a monotone function, so it is impossible to back-transform the results. Each transformed response value corresponds to two possible values on either side of the median of the original variable. The model we have here does not estimate effects on the given variable, but rather effects on the dispersion of that variable. It's like trying to use the speedometer as a substitute for a navigation system.
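A small base-R illustration of why no inverse exists, using five made-up values (y, m and z are hypothetical names, not the poster's data):

```r
# Hypothetical data: five values whose median is 10.
y <- c(4, 7, 10, 13, 16)
m <- median(y)

# The transformation used in the question.
z <- sqrt(abs(y - m))

# 7 and 13 lie 3 units on either side of the median and collapse to the
# same transformed value, as do 4 and 16. Given only z, there is no way
# to tell which side of the median the original value was on.
data.frame(y = y, z = z)
```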

I would suggest using a different model, or at least a different response variable.

A possible remedy

Looking again at this, I wonder if what was meant was the symmetric square-root transformation -- what is shown multiplied by sign(Variable1 - median(Variable1)). This transformation is available in emmeans::make.tran(). You will need to re-fit the model.
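To make the difference concrete, here is a hand-written base-R sketch of the symmetric square root and its inverse. The helper names symsqrt_fun and symsqrt_inv and the data are hypothetical; in practice the emmeans object would supply these functions:

```r
# Hypothetical data: five values whose median is 10.
y <- c(4, 7, 10, 13, 16)
m <- median(y)

# Symmetric square root: the sign factor keeps values below the median negative.
symsqrt_fun <- function(y, m) sign(y - m) * sqrt(abs(y - m))

# Its inverse: square the magnitude, restore the sign, shift back by the median.
symsqrt_inv <- function(z, m) m + sign(z) * z^2

z <- symsqrt_fun(y, m)       # strictly increasing in y, hence monotone
y_back <- symsqrt_inv(z, m)  # recovers y up to floating-point error
all.equal(y_back, y)
```

Because the transformed values keep their order, each one maps back to exactly one original value.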

What I suggest is creating the transformation object first, then using it throughout:

library(lme4)
library(emmeans)

# Variable1 is taken from the data frame used to fit the model
symsqrt <- make.tran("sympower", param = c(0.5, median(dataframe$Variable1)))

fit <- with(symsqrt, 
    lmer(linkfun(Variable1) ~ x + y + (1|z), data = dataframe)
)

emmeans(fit, ~ x + y, type = "response")

symsqrt comprises a list of functions needed to implement the transformation. The transformation itself is symsqrt$linkfun, and the emmeans package knows to look for the other stuff when the response transformation is named linkfun.
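If it helps to see what is inside such an object, the sketch below builds one from made-up values and checks that its two main functions undo each other. The vector y is hypothetical and stands in for the real response:

```r
library(emmeans)

# Hypothetical response values; the median is the centre of the transformation.
y <- c(4, 7, 10, 13, 16)
symsqrt <- make.tran("sympower", param = c(0.5, median(y)))

# Components stored in the transformation object.
names(symsqrt)

# linkfun applies the transformation and linkinv reverses it.
z <- symsqrt$linkfun(y)
y_back <- symsqrt$linkinv(z)
all.equal(y_back, y)
```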

BTW, please break the habit of wrapping emmeans() in as.data.frame(). That renders invisible some important annotations, and also disables the possibility of following up with contrasts and comparisons. If you think you want to see more precision than is shown, you can precede the call with emm_options(opt.digits = FALSE); but really, you are kidding yourself if you think those extra digits give you useful information.
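As an illustration of what is lost, the sketch below keeps the emmeans result as is and then follows up with comparisons. It uses the pigs data from emmeans rather than the poster's model, and fit_pigs and emm are made-up names:

```r
library(emmeans)

# Illustrative model on the built-in pigs data, with a log response.
fit_pigs <- lm(log(conc) ~ source + factor(percent), data = pigs)

# Keep the emmGrid object instead of converting it to a data frame.
emm <- emmeans(fit_pigs, ~ source, type = "response")

emm          # printed summary, including the annotations below the table
pairs(emm)   # pairwise comparisons, which need the emmGrid object

# Optional: show more digits than the default display.
# emm_options(opt.digits = FALSE)
```

If a table is needed for plotting or export, summary(emm) already returns an object that inherits from data.frame while keeping those annotations.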

Russ Lenth
  • Thank you very much @RussLenth, the solution you suggested worked well to transform the data and to back-transform the emmeans – Ana Carolina Jun 14 '22 at 06:11