
I have a dataframe with 12 variables:

id_group1, id_group2, ..., id_group11 : 11 variables with a numeric value

mean_id: mean over all the above mentioned id_group variables

What I need now is a new variable that contains the row sum of only those id_group variables whose value is LARGER THAN mean_id.

I am new to R and am still struggling with seemingly simple operations - so far I have tried using ifelse constructions but it never seemed to work.

Does anyone have an idea how to go about this?

SecretAgentMan
Sarah1989

1 Answer


Here is one option with apply. Loop over the rows and, assuming that the last (12th) column is 'mean_id', subset the other elements that are greater than the 12th and take their sum:

apply(df1, 1, function(x) sum(x[-12][x[-12] > x[12]], na.rm = TRUE))
#[1] 42 40 52 39 50 51 49 49 24 27

Or with rowSums: replace the elements in the columns other than the 12th with NA wherever the value is less than or equal to the mean column, then take the rowSums:

rowSums(replace(df1[-12], df1[-12] <= df1[,12], NA), na.rm = TRUE)
#[1] 42 40 52 39 50 51 49 49 24 27
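Since the question asks for a new variable, here is a minimal sketch of the same idea that stores the result in the data frame and selects the columns by name rather than by position. The toy data frame `toy`, its three `id_group` columns, and the column name `sum_above_mean` are made up for illustration. It also assumes `mean_id` is the row mean of the `id_group` columns, as described in the question.

```r
# Toy data: three hypothetical id_group columns instead of eleven.
toy <- data.frame(
  id_group1 = c(1, 5, 2),
  id_group2 = c(4, 5, 8),
  id_group3 = c(7, 2, 2)
)

# Pick the id_group columns by name, so column order does not matter.
id_cols <- grep("^id_group", names(toy), value = TRUE)

# Assumption: mean_id is the row mean over the id_group columns.
toy$mean_id <- rowMeans(toy[id_cols], na.rm = TRUE)

# Set every value that is not strictly above the row mean to NA,
# then sum what is left in each row.
vals <- as.matrix(toy[id_cols])
vals[vals <= toy$mean_id] <- NA
toy$sum_above_mean <- rowSums(vals, na.rm = TRUE)

toy$sum_above_mean
# [1]  7 10  8
```

All three rows have a mean of 4. In the first row only the 7 is strictly above the mean (the 4 is not), so the sum is 7. A row in which no value exceeds the mean would get a sum of 0.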

data

set.seed(24)
df1 <- as.data.frame(matrix(sample(1:8, 11 * 10, replace = TRUE), 
     ncol = 11, dimnames = list(NULL, paste0("id_group", 1:11))))
df1$mean_id <- sample(1:6, 10, replace = TRUE)
akrun