2

How do I get Pivots with subtotals (like in MS Excel Pivot Tables) in R?

I am using dcast from reshape2 package to create pivots in R. I also got grand totals working using rowSums and colSums. I admit I do not understand the intricacies in the dcast parameter set. I just know how to create the pivot and the help file is going over my head. It will be extremely helpful if someone can solve this using dcast (I suspect it can do it all), and explain the parameters necessary for the solution.

I am using this code (C2 has two factors, X1 & X2):

PIV <- dcast(DF, C1~C2, value.var="C3", sum)

I am not getting what all these margins, subset, fills or drop does. The help file seems overly obtuse, and I did not get much help on the internet as well.

Sample Pivot (without subtotal, T is total):

C1 X1 X2 T
a  12 1  13
a  14 2  16
b  16 3  19
b  11 4  15
b  8  5  13
T  61 15 76

Sample Expected Pivot (with subtotal t, T is total)

C1 X1 X2 T
a  12 1  13
a  14 2  16
ta 26 3  29
b  16 3  19
b  11 4  15
b  8  5  13
tb 35 12 47
T  61 15 76
Arani
  • 753
  • 1
  • 9
  • 23
  • Try `library(data.table);rbind(setDT(df1), df1[, lapply(.SD, sum), C1][, C1:= paste0("t", C1)])` – akrun Dec 18 '17 at 13:04

1 Answers1

3

Assuming you're starting without any totals, with something like this:

mydf <- structure(list(C1 = c("a", "a", "b", "b", "b"), X1 = c(12L, 14L, 
    16L, 11L, 8L), X2 = 1:5), .Names = c("C1", "X1", "X2"), row.names = c(NA, 
    5L), class = "data.frame")

mydf
##   C1 X1 X2
## 1  a 12  1
## 2  a 14  2
## 3  b 16  3
## 4  b 11  4
## 5  b  8  5

Then you would have to use the margins argument for dcast to get the output you're after.

library(reshape2)
mydfl <- melt(mydf)
mydfl$ind <- with(mydfl, ave(C1, C1, variable, FUN = seq_along))
dcast(mydfl, C1 + ind ~ variable, sum, 
      margins = c("C1", "ind", "variable"))
#      C1   ind X1 X2 (all)
# 1     a     1 12  1    13
# 2     a     2 14  2    16
# 3     a (all) 26  3    29
# 4     b     1 16  3    19
# 5     b     2 11  4    15
# 6     b     3  8  5    13
# 7     b (all) 35 12    47
# 8 (all) (all) 61 15    76

The "margins" argument is what is used to get the subtotals. Here, we're taking subtotals according to the "C1" (Grand Total) and "ind" variable (Subtotal), and we're also adding row totals (using "variable").

That said, I'm not sure if I'd recommend this as you're sort of mixing data and analysis in the same table.

Arani
  • 753
  • 1
  • 9
  • 23
A5C1D2H2I1M1N2O1R2T1
  • 190,393
  • 28
  • 405
  • 485
  • Thanks a lot. I suppose then `reshape2` `margins` might work the same way. I will try that. Basically I am trying to automate report generation so that I don't have to do this everyday. The analysis part is done and I got stuck at the subtotal front as that is a good MIS practice when reporting to various departments at once (C1 is the department name). – Arani Dec 18 '17 at 22:04
  • @Arani, cool. Just realized that "reshape2" does have the `margins` argument, but it works a little bit differently. Edited the answer. I've been using `dcast.data.table` so much recently (which hasn't yet implemented `margins`) that I mistakenly said it was not carried over to "reshape2". – A5C1D2H2I1M1N2O1R2T1 Dec 19 '17 at 01:04
  • "grand_row" and "grand_col" was not working for me, even though they were in the documentation! However, I just needed to add the factor column to margin and it worked like butter. – Arani Dec 19 '17 at 06:45
  • 1
    @Arani, yes, I realised that and updated my answer. Glad it worked! – A5C1D2H2I1M1N2O1R2T1 Dec 19 '17 at 07:04