I have a data frame (df) with 144 columns (one per trial) recording trial success (Yes/No) for each participant (the rows). A subset looks like this:

V1      V2      V3      V4      V5  
Yes     No      Yes     Yes     No
Yes     No      No      No      No
Yes     Yes     Yes     Yes     No

I want to count the occurrences of Yes and No outcomes per participant across the 144 trials. However, I also want to select specific trials (e.g. V1, V4, V5, V110, V112) and count the outcomes for those only. If I write:

Yes <- rowSums(df == "Yes") # Count the "Yes" per row
cbind(Yes, No = ncol(df) - Yes) # Subtract from the number of columns to get "No", then combine
#       Yes   No
# [1,]    3    2
# [2,]    1    4
# [3,]    4    1

This gives me the counts of Yes and No per participant, but across all trials. How can I restrict the count to certain columns (trials) and still count per participant?

e. erhan

1 Answer

You can subset df using [ while comparing. Here columns 1, 4, and 5 are selected.

rowSums(df[,c(1,4,5)] == "Yes") #For column 1, 4 and 5
#[1] 2 1 2
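
Since the question refers to trials by name (V1, V4, V5, V110, ...), the same idea can be sketched with column names instead of positions, and extended to return both counts. This is only an illustration: the toy df below rebuilds the five-column subset from the question, and the trials vector is a hypothetical selection. drop = FALSE keeps the result a data frame even if only one trial is selected.

```r
# Toy data matching the subset shown in the question
df <- data.frame(
  V1 = c("Yes", "Yes", "Yes"),
  V2 = c("No",  "No",  "Yes"),
  V3 = c("Yes", "No",  "Yes"),
  V4 = c("Yes", "No",  "Yes"),
  V5 = c("No",  "No",  "No")
)

# Hypothetical selection of trials, given by column name
trials <- c("V1", "V4", "V5")

# Keep only the selected trials; drop = FALSE guards the one-column case
sub <- df[, trials, drop = FALSE]

# Count "Yes" per row, then derive "No" from the number of selected trials
yes <- rowSums(sub == "Yes")
counts <- cbind(Yes = yes, No = ncol(sub) - yes)
counts
# Yes per participant: 2 1 2; No per participant: 1 2 1
```

Deriving No as ncol(sub) - yes assumes every cell is either "Yes" or "No"; with missing values, counting rowSums(sub == "No", na.rm = TRUE) separately would be safer.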

To calculate the percentage of Yes (asked in the comments), rowMeans could be used:

100 * rowMeans(df == "Yes")
#[1] 60 20 80
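
To store the percentage as a new variable for a subset of trials, one possible base R sketch is to assign the rowMeans result to a column. The column name yes_pct and the trials selection are made up for illustration, and the toy df again mirrors the subset from the question.

```r
# Toy data matching the subset shown in the question
df <- data.frame(
  V1 = c("Yes", "Yes", "Yes"),
  V2 = c("No",  "No",  "Yes"),
  V3 = c("Yes", "No",  "Yes"),
  V4 = c("Yes", "No",  "Yes"),
  V5 = c("No",  "No",  "No")
)

# Hypothetical selection of trials, given by column name
trials <- c("V1", "V4", "V5")

# Percentage of "Yes" over the selected trials, stored as a new column
# (yes_pct is an illustrative name, not one from the question)
df$yes_pct <- 100 * rowMeans(df[, trials, drop = FALSE] == "Yes")
round(df$yes_pct, 1)
# 66.7 33.3 66.7
```

Because rowMeans divides by the number of selected columns, there is no need to hard-code a denominator such as 5.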
GKi
  • Thanks! One further question: how can I create a new variable (Yes %) holding the percentage of Yes responses per participant? I thought something like data_c1 %>% mutate (cond1_acc = ja_c1 / 5), where ja_c1 is the object holding the rowSums you just wrote, might work, but it gives me an error. – e. erhan May 12 '21 at 13:08