aggregate is designed to apply one function to multiple columns and returns a data frame with one row for each combination of categories, while tapply is designed to work on a single vector and returns its results as a matrix or array. Using only a two-column matrix does not really allow the capacities of either function (or their salient differences) to be demonstrated. aggregate also has a formula method, which tapply does not.
> Aaa <- data.frame(amount=c(1,2,1,2,1,1,2,2,1,1,1,2,2,2,1), cat=sample(letters[21:24], 15, replace=TRUE),
+ card=c("a","b","c","a","c","b","a","c","b","a","b","c","a","c","a"))
> with( Aaa, tapply(amount, INDEX=list(cat,card), mean) )
    a   b   c
u 1.5 1.5  NA
v 2.0 1.0 2.0
w 1.0  NA 1.5
x 1.5  NA 1.5
> aggregate(amount~cat+card, data=Aaa, FUN= mean)
  cat card amount
1   u    a    1.5
2   v    a    2.0
3   w    a    1.0
4   x    a    1.5
5   u    b    1.5
6   v    b    1.0
7   v    c    2.0
8   w    c    1.5
9   x    c    1.5
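The multi-column point is not visible in the example above, since it summarises only amount. A minimal sketch of that difference follows, assuming a small made-up data frame (df, grp and weight are hypothetical names, and the values are fixed rather than sampled so the result is reproducible):

```r
# Hypothetical data: two numeric columns and one grouping column
df <- data.frame(
  grp    = c("u", "u", "v", "v", "w"),
  amount = c(1, 2, 1, 2, 1),
  weight = c(10, 20, 30, 40, 50)
)

# aggregate: one function applied to several columns at once,
# with the result returned as a data frame (one row per group)
agg <- aggregate(cbind(amount, weight) ~ grp, data = df, FUN = mean)
print(agg)

# tapply: one vector at a time, with the result returned as a
# named one-dimensional array
tap <- tapply(df$amount, df$grp, mean)
print(tap)
```

To get the weight means out of tapply as well, a second call (or a loop over columns) would be needed, which is the practical difference described above.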
The xtabs function also delivers an R "table", and it has a formula interface. R tables are matrices (or higher-dimensional arrays) that typically hold integer values, because they are designed to be "contingency tables": counts of items in cross-classifications of the marginal categories.
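As a rough illustration of the xtabs formula interface, here is a small sketch on made-up data (df and its values are hypothetical and are not the Aaa data above). Note that a two-sided formula makes xtabs sum the left-hand variable within each cell; it does not compute means.

```r
# Hypothetical data with two classifying columns
df <- data.frame(
  amount = c(1, 2, 1, 2, 1, 1),
  cat    = c("u", "u", "v", "v", "w", "w"),
  card   = c("a", "b", "a", "a", "b", "b")
)

# One-sided formula: counts of rows in each cat x card cell
counts <- xtabs(~ cat + card, data = df)
print(counts)

# Two-sided formula: sums of amount within each cat x card cell
sums <- xtabs(amount ~ cat + card, data = df)
print(sums)
```

Unlike the tapply result, cells with no observations show up here as 0 rather than NA.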