What is the difference between tapply and aggregate in R?

Question

Aaa <- data.frame(amount=c(1,2,1,2,1,1,2,2,1,1,1,2,2,2,1), 
                  card=c("a","b","c","a","c","b","a","c","b","a","b","c","a","c","a"))

aggregate(x=Aaa$amount, by=list(Aaa$card), FUN=mean)

##   Group.1    x
## 1       a 1.50
## 2       b 1.25
## 3       c 1.60

tapply(Aaa$amount, Aaa$card, mean)

##    a    b    c 
## 1.50 1.25 1.60

The code above is an example.

Both aggregate and tapply seem very handy, and they appear to do similar things.

Can someone explain their differences, or give examples of them?

You just gave the examples. Examine them. If you save the output in a variable you can look at the `class`, `summary`, and structure (`str`) for starters. — John, Sep 22 '14 at 03:05
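
Following that suggestion, here is a minimal sketch that saves both results and inspects them. The variable names `agg_out` and `tap_out` are illustrative and do not come from the question.

```r
Aaa <- data.frame(amount = c(1,2,1,2,1,1,2,2,1,1,1,2,2,2,1),
                  card = c("a","b","c","a","c","b","a","c","b","a","b","c","a","c","a"))

# Save each result instead of only printing it
agg_out <- aggregate(x = Aaa$amount, by = list(Aaa$card), FUN = mean)
tap_out <- tapply(Aaa$amount, Aaa$card, mean)

class(agg_out)  # a data frame: one column for the groups, one for the means
class(tap_out)  # a one-dimensional array whose names are the groups

str(agg_out)
str(tap_out)
```

The numbers are the same in both objects; what differs is the container, which determines how the result can be indexed, merged, or passed on to other functions.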

score 16 · Accepted Answer · edited Feb 14 '19 at 20:49

aggregate is designed to apply one function to multiple columns and returns a data frame with one row for each category (or each combination of categories), while tapply is designed to work on a single vector and returns its results as a matrix or array. A two-column data frame alone does not really allow the capabilities of either function (or their salient differences) to be demonstrated. aggregate also has a formula method, which tapply does not.

> # cat is drawn at random (no seed is set), so the values below will vary from run to run
> Aaa <- data.frame(amount=c(1,2,1,2,1,1,2,2,1,1,1,2,2,2,1), cat=sample(letters[21:24], 15, replace=TRUE),
+                   card=c("a","b","c","a","c","b","a","c","b","a","b","c","a","c","a"))
> with(Aaa, tapply(amount, INDEX=list(cat, card), mean))
    a   b   c
u 1.5 1.5  NA
v 2.0 1.0 2.0
w 1.0  NA 1.5
x 1.5  NA 1.5

> # aggregate omits the empty cat/card combinations that tapply reports as NA
> aggregate(amount ~ cat + card, data=Aaa, FUN=mean)
  cat card amount
1   u    a    1.5
2   v    a    2.0
3   w    a    1.0
4   x    a    1.5
5   u    b    1.5
6   v    b    1.0
7   v    c    2.0
8   w    c    1.5
9   x    c    1.5
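
The transcript above still summarises only one column. As a minimal sketch of the "multiple columns with one function" point, the example below uses a small hypothetical data frame `Bbb` with an invented second numeric column `qty`; neither name comes from the original post.

```r
# Hypothetical data: `Bbb` and `qty` are invented for illustration only
Bbb <- data.frame(amount = c(1, 2, 1, 2, 1, 1),
                  qty    = c(10, 20, 30, 40, 50, 60),
                  card   = c("a", "b", "a", "b", "a", "b"))

# aggregate: one call summarises several columns and returns a data frame
agg_multi <- aggregate(cbind(amount, qty) ~ card, data = Bbb, FUN = mean)

# tapply: takes one vector per call, so the columns are looped over here;
# sapply then binds the per-column results into a matrix
tap_multi <- sapply(Bbb[c("amount", "qty")],
                    function(v) tapply(v, Bbb$card, mean))

agg_multi
tap_multi
```

With aggregate the grouping variable ends up as an ordinary column, whereas in the tapply version it survives only as the row names of the matrix.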

The xtabs function likewise has a formula interface, and it returns an R "table". R tables are matrices (or higher-dimensional arrays) that typically hold integer values, because they are designed to be "contingency tables": counts of items in cross-classifications of the marginal categories.
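
As a minimal sketch of xtabs on the question's data (the names `counts` and `totals` are illustrative): with an empty left-hand side the formula counts rows per category, and with a numeric left-hand side it sums that column within each category.

```r
Aaa <- data.frame(amount = c(1,2,1,2,1,1,2,2,1,1,1,2,2,2,1),
                  card = c("a","b","c","a","c","b","a","c","b","a","b","c","a","c","a"))

# No left-hand side: a contingency table of row counts per card
counts <- xtabs(~ card, data = Aaa)

# Numeric left-hand side: the sum of amount within each card
totals <- xtabs(amount ~ card, data = Aaa)

class(counts)  # includes "table"
counts
totals
```

Note that xtabs only counts or sums; for other summaries such as the mean, tapply or aggregate is still needed.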
