
I have a data frame in which each of 105 subjects does 6 trials.

For each subj, I want the mean number of 'skip' choices over the 6 trials, i.e. the number of 'skip' rows divided by 6.

How do I start?

    subj entropy n_gambles trial choice
1      0    high         2     0   skip
2      0    high         2     1   skip
3      0    high         2     2   skip
4      0    high         2     3   skip
5      0    high         2     4   skip
6      0    high         2     5   skip
7      1    high        32     0    buy
8      1    high        32     1    buy
9      1    high        32     2    buy
10     1    high        32     3    buy
11     1    high        32     4    buy
12     1    high        32     5    buy
Cœur
  • What do you mean by "mean of 'skip'"? – Roland Aug 30 '13 at 12:37
  • Do you really want 6 (which just happens to be the number of trials per subject), or do you want all trials for a given subject? – Carl Witthoft Aug 30 '13 at 12:53
  • Hi Roland, I meant the mean number of skips over the 6 trials for each subj. For example, for subj 0, the mean number of 'skip' over the 6 trials is 6/6 = 1. But there are other cases with a mixture of buys and skips. :) – user2707619 Aug 30 '13 at 13:28
  • Hi Carl, each of the 105 subjects does 6 trials in each condition, so 6 trials is essentially all the trials for a given subj. – user2707619 Aug 30 '13 at 13:30

2 Answers


You can use ddply from the plyr package. Since you mentioned that each subject has six trials, the mean is computed per subject as the number of observations with choice == "skip" divided by 6:

library(plyr)

# Per subject: count the rows where choice is "skip", then divide by the 6 trials
ddply(df, .(subj), summarise, mymean = length(which(choice == "skip")) / 6)
#   subj mymean
# 1    0      1
# 2    1      0

Note: df is your data
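
As a hedged alternative that does not hard-code the 6, here is a minimal base R sketch. It relies on the fact that the mean of a logical vector is the share of TRUE values. The data frame is rebuilt from the two subjects shown in the question so the snippet runs on its own, and the name skip_rate is just an illustrative choice.

```r
# Same two subjects as in the question, rebuilt so the snippet is self-contained
df <- data.frame(
  subj   = rep(c(0, 1), each = 6),
  trial  = rep(0:5, times = 2),
  choice = rep(c("skip", "buy"), each = 6)
)

# choice == "skip" is TRUE/FALSE; its mean within each subj is the
# proportion of that subject's trials that were skips
skip_rate <- tapply(df$choice == "skip", df$subj, mean)
skip_rate
```

With this data, subj 0 gets 1 and subj 1 gets 0. If a subject had a different number of trials, the divisor would adjust automatically, which may or may not be what you want.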

Metrics

If I had to guess, you want the mean of n_gambles for each subject over the rows where choice == "skip". In that case this might work:

# Data
df <- read.table(text = "subj  entropy n_gambles   trial   choice
0   high    2   0   skip
0   high    2   1   skip
0   high    2   2   skip
0   high    2   3   skip
0   high    2   4   skip
0   high    2   5   skip
1   high    32  0   buy
1   high    32  1   buy
1   high    32  2   buy
1   high    32  3   buy
1   high    32  4   buy
1   high    32  5   buy", header = TRUE)

# Get mean
aggregate(df[df$choice == "skip","n_gambles"],
          list(subj=df[df$choice == "skip","subj"]),
          mean)

# Output
#   subj x
# 1    0 2

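
To unpack the aggregate() call above piece by piece, here is a small sketch. The intermediate names is_skip, values, and groups are introduced only for illustration, and the data frame is a trimmed copy of the question's sample so the snippet runs on its own.

```r
# Trimmed copy of the sample data (only the columns used here)
df <- data.frame(
  subj      = rep(c(0, 1), each = 6),
  n_gambles = rep(c(2, 32), each = 6),
  choice    = rep(c("skip", "buy"), each = 6)
)

is_skip <- df$choice == "skip"      # TRUE on rows where the subject skipped

values <- df[is_skip, "n_gambles"]  # what gets averaged: n_gambles on skip rows
groups <- df[is_skip, "subj"]       # grouping key: subj on those same rows

# Same call as above with the pieces named; the unnamed value column is called x
res <- aggregate(values, list(subj = groups), mean)
res
```

So list(subj = ...) only supplies the grouping variable, restricted to the skip rows so it lines up with the values. The x column is therefore the mean of n_gambles, not a proportion of skips, and subjects with no skip rows drop out of the result.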
EDIT: As I now understand it, you want the frequency of skip per subj. Try this:

# Get counts for every subj/choice combination
result <- as.data.frame(table(df$subj, df$choice))
colnames(result) <- c("subj", "choice", "Freq")
# Subset for "skip" and divide by 6 (trials per subject)
result <- result[result$choice == "skip", ]
result$Freq <- result$Freq / 6
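
A possible variation on the table() approach, sketched here with prop.table() so the division by 6 is not written out. Subject 2 below is invented purely to show a mix of buys and skips (the question mentions such cases exist but does not show one), and skip_prop is an illustrative name.

```r
# Hypothetical data: subj 0 and 1 follow the question, subj 2 is made up
df <- data.frame(
  subj   = rep(c(0, 1, 2), each = 6),
  choice = c(rep("skip", 6), rep("buy", 6),
             "skip", "buy", "skip", "buy", "buy", "skip")
)

counts <- table(df$subj, df$choice)       # rows: subj, columns: choice
props  <- prop.table(counts, margin = 1)  # divide each row by its row total
skip_prop <- props[, "skip"]              # keep only the "skip" column
skip_prop
```

For the invented subj 2, three of six trials are skips, so the value is 0.5. Note that this divides by the number of trials each subject actually has, which matches dividing by 6 only when every subject has exactly six rows.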
zx8754
  • I think the answer is close, but the outcome 'x' is the number of gambles rather than the mean number of skips over all trials for each subj. Could you please explain this part of the code? I don't quite get what it means: subj=df[df$choice == "skip","subj"]) – user2707619 Aug 30 '13 at 13:33
  • Yes, this code gives the correct mean number of skips. Thanks very much. – user2707619 Aug 30 '13 at 14:21