Create multiple sums

Question

Ciao, Here is a replicate able example.

   df <- data.frame("STUDENT"=c(1,2,3,4,5),
                     "TEST1A"=c(NA,5,5,6,7),
                     "TEST2A"=c(NA,8,4,6,9),
                     "TEST3A"=c(NA,10,5,4,6),
                     "TEST1B"=c(5,6,7,4,1),
                     "TEST2B"=c(10,10,9,3,1),
                     "TEST3B"=c(0,5,6,9,NA),
                     "TEST1TOTAL"=c(NA,23,14,16,22),
                     "TEST2TOTAL"=c(10,16,15,12,NA))

I have columns STUDENT through TEST3B and want to create TEST1TOTAL TEST2TOTAL. TEST1TOTAL=TEST1A+TEST2A+TEST3A and so on for TEST2TOTAL. If there is any missing score in TEST1A TEST2A TEST3A then TEST1TOTAL is NA.

here is my attempt but is there a solution with less lines of coding? Because here I will need to write this line out many times as there are up to TEST A through O.

TEST1TOTAL=rowSums(df[,c('TEST1A', 'TEST2A', 'TEST3A')], na.rm=TRUE)

Your `TEST2TOTAL` seems to be wrong. Could you please verify it? — Saurabh Chauhan, Sep 18 '18 at 16:49
Hang on, does `TEST1TOTAL` refer to the total of tests numbered 1 (as would be logical from the name), or tests lettered A (as shown)? — divibisan, Sep 18 '18 at 17:06

nghauran · Answer 1 · 2018-09-18T16:59:36.920

2

Try:

library(dplyr)
df %>%
        mutate(TEST1TOTAL = TEST1A+TEST2A+TEST3A,
               TEST2TOTAL = TEST1B+TEST2B+TEST3B)

or

df %>%
        mutate(TEST1TOTAL = rowSums(select(df, ends_with("A"))),
               TEST2TOTAL = rowSums(select(df, ends_with("B"))))

edited Sep 18 '18 at 16:59

answered Sep 18 '18 at 16:49

nghauran

6,648
2
20
29

Jilber Urbina · Accepted Answer · 2018-09-18T17:07:15.760

Using just R base functions:

output <- data.frame(df1, do.call(cbind, lapply(c("A$", "B$"), function(x) rowSums(df1[, grep(x, names(df1))]))))

Customizing colnames:

> colnames(output)[(ncol(output)-1):ncol(output)] <- c("TEST1TOTAL", "TEST2TOTAL")
> output
  STUDENT TEST1A TEST2A TEST3A TEST1B TEST2B TEST3B TEST1TOTAL TEST2TOTAL
1       1     NA     NA     NA      5     10      0         NA         15
2       2      5      8     10      6     10      5         23         21
3       3      5      4      5      7      9      6         14         22
4       4      6      6      4      4      3      9         16         16
5       5      7      9      6      1      1     NA         22         NA

divibisan · Answer 3 · 2018-09-18T17:12:18.733

I think for what you want, Jilber Urbina's solution is the way to go. For completeness sake (and because I learned something figuring it out) here's a tidyverse way to get the score totals by test number for any number of tests.

The advantage is you don't need to specify the identifiers for the tests (beyond that they're numbered or have a trailing letter) and the same code will work for any number of tests.

library(tidyverse)

df_totals <- df %>%
    gather(test, score, -STUDENT) %>%                    # Convert from wide to long format
    mutate(test_num = paste0('TEST', ('[^0-9]', '', test),
                             'TOTAL'),                   # Extract test_number from variable
           test_let = gsub('TEST[0-9]*', '', test)) %>%  # Extract test_letter (optional)
    group_by(STUDENT, test_num) %>%                      # group by student + test
    summarize(score_tot = sum(score)) %>%                # Sum score by student/test
    spread(test_num, score_tot)                          # Spread back to wide format

df_totals

# A tibble: 5 x 4
# Groups:   STUDENT [5]
  STUDENT TEST1TOTAL TEST2TOTAL TEST3TOTAL
    <dbl>      <dbl>      <dbl>      <dbl>
1       1         NA         NA         NA
2       2         11         18         15
3       3         12         13         11
4       4         10          9         13
5       5          8         10         NA

If you want the individual scores too, just join the totals together with the original:

left_join(df, df_totals, by = 'STUDENT')
  STUDENT TEST1A TEST2A TEST3A TEST1B TEST2B TEST3B TEST1TOTAL TEST2TOTAL TEST3TOTAL
1       1     NA     NA     NA      5     10      0         NA         NA         NA
2       2      5      8     10      6     10      5         11         18         15
3       3      5      4      5      7      9      6         12         13         11
4       4      6      6      4      4      3      9         10          9         13
5       5      7      9      6      1      1     NA          8         10         NA

Create multiple sums

3 Answers3