What's the easiest way to calculate row-wise sums? For example, how would I calculate the sum across all variables whose names start with "txt_"? (See the example below.)

df <- data.frame(var1 = c(1, 2, 3),
                 txt_1 = c(1, 1, 0),
                 txt_2 = c(1, 0, 0),
                 txt_3 = c(1, 0, 0))

D. Studer
  • What result are you expecting? One sum, or a sum for each appropriate column? – Jon Spring Feb 16 '22 at 23:07
  • sum = c(3, 1, 0) – D. Studer Feb 16 '22 at 23:09
  • It's not tidyverse, but `rowSums(df[startsWith(names(df), "txt")])` would do it. See previous questions like https://stackoverflow.com/questions/13482532/sum-cells-of-certain-columns-for-each-row and https://stackoverflow.com/questions/45237034/sum-up-certain-variables-columns-by-variable-names and https://stackoverflow.com/questions/43515925/rowsums-conditional-on-column-name – thelatemail Feb 16 '22 at 23:10

2 Answers

base R

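We can first use grepl to find the column names that start with txt_ (the ^ anchors the pattern to the start of the name), then call rowSums on that subset. Passing drop = FALSE keeps the subset a data frame even when only one column matches, which rowSums requires.

rowSums(df[, grepl("^txt_", names(df)), drop = FALSE])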

[1] 3 1 0
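As a side note on the pattern, an unanchored "txt_" matches anywhere in a column name, not only at the start. The sketch below illustrates the difference under an assumption: `df2` and its `my_txt_note` column are hypothetical and were invented here only to show a name that contains "txt_" without starting with it.

```r
# Hypothetical data: `my_txt_note` is an invented column, not part of the question
df2 <- data.frame(var1 = c(1, 2, 3),
                  txt_1 = c(1, 1, 0),
                  txt_2 = c(1, 0, 0),
                  my_txt_note = c(10, 10, 10))

# Unanchored pattern: also picks up `my_txt_note`
loose <- names(df2)[grepl("txt_", names(df2))]

# Prefix match: only names that begin with "txt_"
strict <- names(df2)[startsWith(names(df2), "txt_")]

# Row sums over the prefix-matched columns only
sums <- rowSums(df2[, strict, drop = FALSE])

print(loose)
print(strict)
print(sums)
```

With the original df from the question both selections pick the same three columns, so the distinction only matters when other names happen to contain "txt_".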

To attach the sums to the original data frame as a new column, use cbind:

cbind(df, sums = rowSums(df[, grepl("^txt_", names(df)), drop = FALSE]))

  var1 txt_1 txt_2 txt_3 sums
1    1     1     1     1    3
2    2     1     0     0    1
3    3     0     0     0    0

Tidyverse

library(tidyverse)

df %>% 
  mutate(sum = rowSums(across(starts_with("txt_"))))

  var1 txt_1 txt_2 txt_3 sum
1    1     1     1     1   3
2    2     1     0     0   1
3    3     0     0     0   0

Or, if you only want the vector of sums, use pull:

df %>% 
  mutate(sum = rowSums(across(starts_with("txt_")))) %>% 
  pull(sum)

[1] 3 1 0

data.table

Here is a data.table option as well:

library(data.table)
dt <- as.data.table(df)

# Add `sum` by reference; .SDcols restricts .SD to the columns starting with "txt_"
dt[, sum := rowSums(.SD), .SDcols = grep("^txt_", names(dt))]

dt[["sum"]]
# [1] 3 1 0
AndrewGB

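Another dplyr option, which computes the sum one row at a time:

df %>% 
  rowwise() %>%
  mutate(sum = sum(c_across(starts_with("txt_")))) %>%
  ungroup()  # drop the rowwise grouping so later steps are not applied per row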
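For comparison, here is a minimal sketch that runs the two dplyr approaches from this thread side by side on the example data, assuming dplyr is installed. The object names `vectorised` and `row_by_row` are illustrative only.

```r
library(dplyr)

df <- data.frame(var1 = c(1, 2, 3),
                 txt_1 = c(1, 1, 0),
                 txt_2 = c(1, 0, 0),
                 txt_3 = c(1, 0, 0))

# Vectorised: a single rowSums() call over the selected columns
vectorised <- df %>%
  mutate(sum = rowSums(across(starts_with("txt_"))))

# Row by row: sum() is evaluated once per row, then the grouping is dropped
row_by_row <- df %>%
  rowwise() %>%
  mutate(sum = sum(c_across(starts_with("txt_")))) %>%
  ungroup()

print(vectorised$sum)
print(row_by_row$sum)
```

Both should give the same sums here. The rowwise form is handy when the per-row function has no vectorised equivalent, while rowSums avoids evaluating the expression once per row.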
Jon Spring