Compute z-score by two groups

Question

I am working on a repeated-measures dataset. The data look like this:

ID=c('X1', 'X1', 'X1', 'X1', 'X2', 'X2', 'X2', 'X3', 'X3', 'X3', 'X3', 'X4', 'X4', 'X4', 'X4', 'X5', 'X5', 'X5', 'X6', 'X6', 'X6', 'X6')
Diag=c('Con', 'Con', 'Con', 'Con', 'Con', 'Con', 'Con', 'AD', 'AD', 'AD', 'AD', 'AD', 'AD', 'AD', 'AD', 'FD', 'FD', 'FD', 'FD', 'FD','FD', 'FD')
Score=c(10, 9, 8, 8, 10, 9, 9, 5, 4, 3, 3, 5, 4, 3, 2, 5, 4, 3, 6, 5, 4, 3)
Time=c(1, 2, 3, 4, 1, 2, 3, 1, 2, 3, 4, 1, 2, 3, 4, 1, 2, 3, 1, 2, 3, 4)

dat <- data.frame(ID, Time, Diag, Score)

where ID = participant ID, Diag = diagnosis, Time = repeat assessment, and Score = score.

I want to calculate z-scores for the AD and FD group scores relative to the 'Con' (control) group scores at each 'Time'. For example, z-scores for the AD and FD groups at Time 2 should be computed relative to the control group at Time 2 (ignoring missing values).
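
Written out, the quantity described above would look roughly like the following. This is a sketch that assumes the control group's own mean and standard deviation at each time point are the reference; the symbols are illustrative and not from the original post.

```latex
% x_{i,t}: score of patient i (AD or FD) at time t
% \bar{x}^{Con}_{t}, s^{Con}_{t}: mean and standard deviation of the
% control-group scores observed at the same time t
z_{i,t} = \frac{x_{i,t} - \bar{x}^{\mathrm{Con}}_{t}}{s^{\mathrm{Con}}_{t}}
```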

The only way I can think of is to split the data frame by 'Time', calculate the z-scores, and merge the data frames back together, but this is painful and time-consuming (and it sometimes gave me NAs when I attempted the merge).

Is there a better way of doing this?
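
One way to avoid the split-and-merge step is sketched below in base R. It assumes the reference is the control group's mean and standard deviation at each time point; the helper names (`con`, `pat`, `con_mean`, `con_sd`, `key`, `z`) are made up for illustration.

```r
# Example data from the question
ID    <- c('X1','X1','X1','X1','X2','X2','X2','X3','X3','X3','X3',
           'X4','X4','X4','X4','X5','X5','X5','X6','X6','X6','X6')
Diag  <- c(rep('Con', 7), rep('AD', 8), rep('FD', 7))
Score <- c(10, 9, 8, 8, 10, 9, 9, 5, 4, 3, 3, 5, 4, 3, 2, 5, 4, 3, 6, 5, 4, 3)
Time  <- c(1, 2, 3, 4, 1, 2, 3, 1, 2, 3, 4, 1, 2, 3, 4, 1, 2, 3, 1, 2, 3, 4)
dat   <- data.frame(ID, Time, Diag, Score)

# Control-group mean and sd per time point (NAs in Score are dropped)
con      <- dat[dat$Diag == "Con", ]
con_mean <- tapply(con$Score, con$Time, mean, na.rm = TRUE)
con_sd   <- tapply(con$Score, con$Time, sd,   na.rm = TRUE)

# Look up the control statistics for each patient row by its Time value
pat   <- dat[dat$Diag != "Con", ]
key   <- as.character(pat$Time)
pat$z <- (pat$Score - as.vector(con_mean[key])) / as.vector(con_sd[key])

pat
```

With this toy data the control scores are identical at Times 1 and 2 (standard deviation 0) and there is only one control observation at Time 4 (standard deviation undefined), so `z` comes out as `-Inf` or `NA` at those time points. That may be one source of the NAs mentioned above.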

I guess the ID is not important here, right? Also, based on the example shown, there is only 1 observation at each Time for Diag Con vs AD or Diag Con vs FD. — akrun, Jul 30 '15 at 06:57
It is two observations; I noticed it just now. But my question is: for the calculation of the `z-score`, do you use the `mean` and `variance` from the `Con` group? — akrun, Jul 30 '15 at 07:22
I first subsetted the dataset and then calculated the z-score (using plyr): `subset(dat, Time==1)->dat.1 dat.1$k<-ddply(dat.1, (.dat.1$Diag), summarize, z_score=(dat.1$Score-mean(dat.1$Score)/sd(dat.1$Score)))` For some reason, it doesn't give me the result now. It says 'object dat.1 not found'. I don't think this code is doing what I intend, though: I want it to compute the z-score relative to the `Con` group — Sid0311, Jul 30 '15 at 07:23
The code is incorrect. Another problem is that you are not comparing with the control group i.e. `con` in your code. It is just calculating by grouping variable 'Diag'. — akrun, Jul 30 '15 at 07:25
Maybe `library(data.table);DT <- rbindlist(lapply(setdiff(unique(dat$Diag), 'Con'), function(x) rbind(dat[dat$Diag=='Con',], dat[dat$Diag==x,])), idcol=TRUE);DT[, (Score[Diag!='Con']-mean(Score[Diag=='Con']))/sd(Score[Diag=='Con']), .(.id, Time)]` — akrun, Jul 30 '15 at 07:30
The first one says `Error in rbindlist(lapply(setdiff(unique(dat$Diag), "Con"), function(x) rbind(dat[dat$Diag == : unused argument (idcol = TRUE)`. When I remove idcol from the code and run the second one, it says `Error in eval(expr, envir, enclos) : object '.id' not found` — Sid0311, Jul 30 '15 at 07:35
I think the `idcol=TRUE` is from the devel version of `data.table`. You can find the instructions to install from [here](https://github.com/Rdatatable/data.table/wiki/Installation) — akrun, Jul 30 '15 at 07:36
possible duplicate of [creating z-scores](http://stackoverflow.com/questions/6148050/creating-z-scores) — Jaap, Jul 30 '15 at 07:45
@Jaap That one is about calculating z-scores, but not by group and not with respect to a control group — akrun, Jul 30 '15 at 07:47
It refuses to run on my Mac even though I have the Command Line Tools installed. I'll keep trying; meanwhile, is there another way to do this using `plyr` or `dplyr`? — Sid0311, Jul 30 '15 at 07:49
You can use a combination of `unnest` from tidyr and dplyr to get the same output — akrun, Jul 30 '15 at 08:07
Try `library(tidyr); library(dplyr); lst <- lapply(setdiff(unique(dat$Diag), 'Con'), function(x) rbind(dat[dat$Diag=='Con',], dat[dat$Diag==x,])); res <- unnest(lst, .id) %>% group_by(.id, Time) %>% mutate(Mean=mean(Score[Diag=='Con']), Sd=ore[Diag=='Con'])) %>% filter(Diag!='Con') %>% mutate(val=(Score-Mean)/Sd)` — akrun, Jul 30 '15 at 08:12
@akrun appears to work fine but I get this one error `Error in eval(expr, envir, enclos) : object 'ore' not found` — Sid0311, Jul 30 '15 at 08:22
@Sid0311 I think there was a typo when I copied it. It should be `Sd=sd(Score[Diag=='Con']))` — akrun, Jul 30 '15 at 09:17
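
For readers following the thread, here is a minimal `dplyr` sketch of the same idea. It is not the code from the comments above: instead of stacking the control rows onto each patient group and using `unnest`, it summarises the control group per `Time` and joins those statistics back onto the patient rows. The column names `Mean`, `Sd` and `val` follow the comment; `con_stats` and `res` are illustrative names, and the sketch assumes the control mean and standard deviation at each time point are the intended reference.

```r
library(dplyr)

# Example data from the question
ID    <- c('X1','X1','X1','X1','X2','X2','X2','X3','X3','X3','X3',
           'X4','X4','X4','X4','X5','X5','X5','X6','X6','X6','X6')
Diag  <- c(rep('Con', 7), rep('AD', 8), rep('FD', 7))
Score <- c(10, 9, 8, 8, 10, 9, 9, 5, 4, 3, 3, 5, 4, 3, 2, 5, 4, 3, 6, 5, 4, 3)
Time  <- c(1, 2, 3, 4, 1, 2, 3, 1, 2, 3, 4, 1, 2, 3, 4, 1, 2, 3, 1, 2, 3, 4)
dat   <- data.frame(ID, Time, Diag, Score)

# One row per time point with the control group's mean and sd
con_stats <- dat %>%
  filter(Diag == "Con") %>%
  group_by(Time) %>%
  summarise(Mean = mean(Score, na.rm = TRUE),
            Sd   = sd(Score, na.rm = TRUE))

# Attach the control statistics to every AD/FD row and standardise
res <- dat %>%
  filter(Diag != "Con") %>%
  left_join(con_stats, by = "Time") %>%
  mutate(val = (Score - Mean) / Sd)

res
```

Because the join is keyed on `Time` only, each AD and FD row is compared with the control group at the same assessment, and `ID` plays no role in the calculation.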
