
I have what I thought was a really simple question. In a longitudinal experiment, every participant in a group rated everyone else on, let's say, 10 variables (e.g. "This person is likeable.", "This person is dull." and so on) at 7 different time points. If I want to get some sort of perceiver and target variance for one variable/response, I'd use:

lmer(scale(Var1) ~ (1|target) + (1|perceiver), data= subset(x, time_point == 1))

Here we have a dependent variable "Var1" of a dataframe "x" with the specification of the 1st time_point (which is also a variable of x).

So far so good, this works just fine.

Now as I said, I have multiple responses and multiple time points. Therefore I wanted to use a) a "for"-loop, or b) lapply, to get all the models at once.

Either way, I have to somehow "index" the dependent variable, whether by specifying the column position (x[,10], with 10 being the assumed position of Var1), by referring to the variable itself (x$Var1), or (which is at least a little odd) by pasting or printing the name of the variable into the formula (colnames(x)[10]).

What I am trying to say is that none of these works. I always get an error about differing variable lengths. But, as I wrote, I am using the exact same columns!

Does anyone have experience with running multiple lmers?

All ideas are welcome and appreciated! I hope I was not too unclear; if you need any further information, I'd be happy to provide it as far as I can.

Cheers, Al

Al_
  • Please post a reproducible example. Are the variables named `Var1` to `Var7`? Does `time_point` range from `1` to `7`? – Sven Hohenstein Dec 11 '13 at 17:20
  • 1) A single mixed model can easily handle repeated measures - I'm not sure building separate models makes much sense statistically. 2) Showing us the output of `head(x)` would go a long way to helping us answer your question; even better would be `dput(head(x))` so that people can rebuild part of your `data.frame` for checking their code. – Matt Parker Dec 11 '13 at 17:25
  • Why do you want to estimate only one fixed effect (intercept)? – Sven Hohenstein Dec 11 '13 at 17:26
  • Hey Sven, thanks for the comment, the dataset itself is huge, and I am not really (codingswise) able to generate one. The time points are ranging from 1 to 7, yes, and the Variables could be "Var1" to "Var10" (they have other, content-specific names, but that doesn't really matter) in columns 1 to 10. The specification within the lmer-formula is the problem. – Al_ Dec 11 '13 at 17:27
  • So you have fewer time points than response variables. Do you want all combinations of these two factors, i.e., 70 models? Please specify the models you're looking for. – Sven Hohenstein Dec 11 '13 at 17:29
  • Hey Matt, and also thanks! It is not the multiple time points I am worried about, it is "just" the specification other than the mere variable name in the lmer function, i.e. you have to write "Var1 ~ ..." instead of "x[,1] ~ ...". Since the dataset is huge, and I simplified the naming, I don't know if this will work well (`dput(head(x))` is huge, too). – Al_ Dec 11 '13 at 17:29
  • Ok, I want to go on later and analyze further models with the extracted random effects, so I want to use each model on its own. E.g. `lmer(scale(Var1) ~ (1|target) + (1|perceiver), data= subset(x, time_point == 1))` and then `lmer(scale(Var2) ~ (1|target) + (1|perceiver), data= subset(x, time_point == 1))` and so on – Al_ Dec 11 '13 at 17:32
  • Ah, I misunderstood - I didn't understand that you have ten different outcome variables. – Matt Parker Dec 11 '13 at 17:32
  • and yes, I want to have a model for each combination, 70 for this example. – Al_ Dec 11 '13 at 17:38

1 Answer


I would try reshaping your data so that each rating has its own record, and then iterate over those:

library(lme4)      # provides lmer()
library(reshape2)  # provides melt()

# This will create a data.frame with one row for each rating,
# which are uniquely specified by the characteristic being rated,
# the time point, the perceiver, and the target
# (I think)
# measure.vars should list all of your outcome columns; the question
# mentions ten of them, assumed here to be named Var1 to Var10
x.melt <- melt(x,
               id.vars = c("time_point", "perceiver", "target"),
               measure.vars = paste0("Var", 1:10)
)

# I'd use plyr to iterate, personally
library(plyr)

# This will return a list containing one model for each combination of variable
# (which are your various outcomes) and time_point
# (.var partially matches dlply's .variables argument)
x.models <- dlply(x.melt, .var = c("variable", "time_point"), .fun = function(d) {
    # d holds the ratings for a single variable at a single time point
    lmer(scale(value) ~ (1|target) + (1|perceiver), data = d)
})

# Which then makes it easy to do things like print summaries for every model
lapply(x.models, summary)
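
Since the stated goal is to pull out the random effects afterwards, here is a minimal sketch of how that might look once the models sit in a list. The toy data, the list name `models`, and the element name `Var1.1` are made up for illustration; with real data you would loop over `x.models` and check `names(x.models)` for the actual labels.

```r
library(lme4)

# Hypothetical toy data standing in for one (variable, time_point) cell:
# 8 people, each rating the 7 others once
set.seed(1)
toy <- expand.grid(perceiver = factor(1:8), target = factor(1:8))
toy <- toy[toy$perceiver != toy$target, ]
p_eff <- rnorm(8)  # simulated perceiver effects
t_eff <- rnorm(8)  # simulated target effects
toy$value <- p_eff[as.integer(toy$perceiver)] +
  t_eff[as.integer(toy$target)] + rnorm(nrow(toy))

fit <- lmer(scale(value) ~ (1 | target) + (1 | perceiver), data = toy)
models <- list(Var1.1 = fit)  # stands in for x.models

# Build one data.frame of target effects per model, labelled by model name
target_effects <- do.call(rbind, lapply(names(models), function(nm) {
  re <- ranef(models[[nm]])$target
  data.frame(model = nm, target = rownames(re), effect = re[["(Intercept)"]])
}))

# Variance components (target, perceiver, residual) as a data.frame
variance_components <- as.data.frame(VarCorr(fit))
```

The same pattern with `$perceiver` in place of `$target` would give the perceiver effects.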

I still think it makes more sense to have time_point as a component in your models, in which case you could just remove it from the .var = c("variable", "time_point") argument and add it to the model specification.
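
As a rough sketch of what that could look like, the example below groups by `variable` only and enters `time_point` as a fixed factor. Treating it as a fixed effect is just one possible reading and an assumption on my part (a random term such as `(1 | time_point)` would be another), and the simulated `long` data frame is purely illustrative.

```r
library(lme4)
library(plyr)

# Hypothetical long-format data: 6 people, 3 time points, 2 outcome variables
set.seed(2)
long <- expand.grid(perceiver = factor(1:6), target = factor(1:6),
                    time_point = 1:3, variable = c("Var1", "Var2"))
long <- long[long$perceiver != long$target, ]
p_eff <- rnorm(6)  # simulated perceiver effects
t_eff <- rnorm(6)  # simulated target effects
long$value <- p_eff[as.integer(long$perceiver)] +
  t_eff[as.integer(long$target)] +
  0.2 * long$time_point + rnorm(nrow(long))

# One model per outcome variable; time_point is now part of the model
models_by_var <- dlply(long, .variables = "variable", .fun = function(d) {
  lmer(scale(value) ~ factor(time_point) + (1 | target) + (1 | perceiver),
       data = d)
})

length(models_by_var)  # one model per outcome variable
```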

In R, many things get a lot easier when the data is in the right shape. It's extremely worthwhile to learn about the "melting" and "casting" concepts behind the reshape2 package - I don't know how I ever got by without them.
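
For completeness, reshaping is not the only route. The "differing variable lengths" error most likely comes from mixing a full-length column such as `x$Var1` or `x[,10]` with a shorter `data = subset(...)`; building the formula from the column name avoids that, because the name is then looked up inside the subset. The sketch below assumes columns named `Var1` and `Var2` in simulated data, and `combos` and `fits` are names I made up.

```r
library(lme4)

# Hypothetical wide-format data: 6 people, 2 time points, 2 outcome columns
set.seed(3)
x <- expand.grid(perceiver = factor(1:6), target = factor(1:6), time_point = 1:2)
x <- x[x$perceiver != x$target, ]
p_eff <- rnorm(6)  # simulated perceiver effects
t_eff <- rnorm(6)  # simulated target effects
for (v in c("Var1", "Var2")) {
  x[[v]] <- p_eff[as.integer(x$perceiver)] +
    t_eff[as.integer(x$target)] + rnorm(nrow(x))
}

# Every combination of outcome variable and time point
combos <- expand.grid(variable = c("Var1", "Var2"), time_point = 1:2,
                      stringsAsFactors = FALSE)

fits <- lapply(seq_len(nrow(combos)), function(i) {
  # Paste the column name into the formula, so it is resolved within `data`
  f <- as.formula(paste0("scale(", combos$variable[i],
                         ") ~ (1 | target) + (1 | perceiver)"))
  lmer(f, data = x[x$time_point == combos$time_point[i], ])
})
names(fits) <- paste(combos$variable, combos$time_point, sep = ".")
```

With ten variables and seven time points, `combos` would simply have 70 rows.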

Matt Parker
  • Hi Matt, awesome, thanks, I thought, this would be easier, but that looks good, I'll try it instantly and let you know. Thanks! The problem about time point within the model would be, that I would need it as a third random effect, and I am not quite sure if that is feasible. But anyhow, what I want to do later, is to extract with `ranef` the effects for each subject and analyze those in latent growth models further on. – Al_ Dec 11 '13 at 17:51
  • There are ways to build a model formula from a loop like you're trying to do, but I think this is easier in the long run and so I forgot how to do it. – Matt Parker Dec 11 '13 at 17:54
  • I haven't done much with mixed models in a while, but having the models in a list like this will definitely help with any kind of operations you want to do on the whole set of models. – Matt Parker Dec 11 '13 at 17:55
  • Finally, there's a StackOverflow for statistics, [stats.stackexchange.com](http://stats.stackexchange.com/), which you should definitely check out for any statistical questions that come up. It's a great site. – Matt Parker Dec 11 '13 at 17:56
  • +1 I suppose you want to replace `subset(x, time_point == 1)` with `x`. – Sven Hohenstein Dec 11 '13 at 17:57
  • Great idea, cool site, and yes, having them all separately is cool. Thanks, you two! – Al_ Dec 11 '13 at 18:03