
I want to use `rowSums` to total several columns (here, years of education), but only where the corresponding value (the respondent's age) is at least 16. The real data has more columns than this example (up to 13 each for age and education), so I am looking for an efficient way to compute the row sums without adding column by column, while keeping the structure of the data frame as it is, since I want to `cbind` more columns afterwards.

What is the best way to get from this dataframe [...]

Age1 <- c(21,31,51,72)
Age2 <- c(22,33,34,54)
Age3 <- c(7,11,10,21)
Edu1 <- c(5,10,10,10)
Edu2 <- c(5,10,5,5)
Edu3 <- c(2,5,4,10)
df <- data.frame(Age1, Age2, Age3, Edu1, Edu2, Edu3)

[...] to TotEdu results?


Nicola

1 Answer


We could define the column numbers for age and education, assuming there is always the same number of each (here both are 3), check which age values are greater than or equal to 16, keep the corresponding education values, and take `rowSums`.

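# Column positions of the age and education variables (paired by position)
age_cols <- 1:3
edu_cols <- 4:6
# Multiply each education value by 1 if the paired age is >= 16, else by 0, then sum per row
df$Total_edu <- rowSums(df[edu_cols] * as.numeric(df[age_cols] >= 16))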

df

#  Age1 Age2 Age3 Edu1 Edu2 Edu3 Total_edu
#1   21   22    7    5    5    2        10
#2   31   33   11   10   10    5        20
#3   51   34   10   10    5    4        15
#4   72   54   21   10    5   10        25
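
To see why the one-liner works, the intermediate steps can be inspected separately. The following is only a sketch using the question's data; the names `is_adult`, `edu` and `total_edu` are illustrative and not part of the original answer.

```r
# Data from the question
df <- data.frame(
  Age1 = c(21, 31, 51, 72), Age2 = c(22, 33, 34, 54), Age3 = c(7, 11, 10, 21),
  Edu1 = c(5, 10, 10, 10),  Edu2 = c(5, 10, 5, 5),    Edu3 = c(2, 5, 4, 10)
)
age_cols <- 1:3
edu_cols <- 4:6

# Logical matrix: TRUE where the age is at least 16
is_adult <- as.matrix(df[age_cols]) >= 16

# Copy the education values and zero out those paired with an age below 16
edu <- as.matrix(df[edu_cols])
edu[!is_adult] <- 0

# Row-wise totals of the remaining education values
total_edu <- rowSums(edu)
```

Here only the third age column contains values below 16 (rows 1 to 3), so `Edu3` contributes to the total in row 4 alone.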
Ronak Shah
  • What if these columns were interspersed among thousands of others? Would subscripting with the column indices returned by `grep` work the same? – Nicola Dec 01 '18 at 09:26
  • 2
    Yes, the end goal is to find the indices of the columns of interest, irrespective of how many columns lie in between. If you don't want to hard-code the values, you can also use `grep`: `age_cols <- grep("Age", names(df)); edu_cols <- grep("Edu", names(df))`, and it should work the same. – Ronak Shah Dec 01 '18 at 09:29
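
The `grep` suggestion above can be sketched on a data frame where an unrelated column sits between the variables of interest. The `Other` column and the anchored patterns (`^Age[0-9]+$`, `^Edu[0-9]+$`) are assumptions added for illustration; the comment itself uses the plain patterns `"Age"` and `"Edu"`, which would also match names such as a hypothetical `AgeGroup`.

```r
# Data from the question, plus a hypothetical unrelated column in between
df <- data.frame(
  Age1  = c(21, 31, 51, 72),
  Age2  = c(22, 33, 34, 54),
  Other = c("a", "b", "c", "d"),  # hypothetical filler column
  Age3  = c(7, 11, 10, 21),
  Edu1  = c(5, 10, 10, 10),
  Edu2  = c(5, 10, 5, 5),
  Edu3  = c(2, 5, 4, 10)
)

# Look up column positions by name instead of hard-coding them
age_cols <- grep("^Age[0-9]+$", names(df))
edu_cols <- grep("^Edu[0-9]+$", names(df))

# Columns are paired by position, so both sets must have the same size
stopifnot(length(age_cols) == length(edu_cols))

# Same computation as in the answer: keep education only where age >= 16
df$Total_edu <- rowSums(df[edu_cols] * as.numeric(df[age_cols] >= 16))
```

Because the pairing is positional, this relies on the age and education columns appearing in the same relative order in the data frame.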