6

I know that functions like xtabs and table allow a user to do cross-tabulation.

For example, the following command generates a pivot table that shows the number of cars for each combination of cylinder count and gear count.

> xtabs(~cyl+gear, data = mtcars)
   gear
cyl  3  4  5
  4  1  8  2
  6  2  4  1
  8 12  0  2
> 

We can extend the formula so that it shows the sum of the horsepower for the cars in each bin:

> xtabs(hp~cyl+gear, data = mtcars)
   gear
cyl    3    4    5
  4   97  608  204
  6  215  466  175
  8 2330    0  599
> 

I am now wondering: is it possible to calculate the mean horsepower for the cars in each bin? For example, something like xtabs(mean(hp)~cyl+gear, data = mtcars).

Mark
  • 2
    I'm not sure how to do it with `xtabs` (which I've never used before), but to do it with the `reshape` package, one way is `cast(melt(mtcars, id = c("cyl", "gear")), cyl ~ gear, subset = variable == "hp", mean)`. – grautur Jul 23 '11 at 05:34
  • 1
    xtabs(hp~cyl+gear, data = mtcars)/xtabs(~cyl+gear, data = mtcars) – jverzani Jul 23 '11 at 13:48
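A minimal sketch of the division suggested in the comment above, using only base R and the built-in mtcars data. The variable names sums, counts and means are just for illustration. Dividing the per-bin sum by the per-bin count gives the per-bin mean; a bin with no cars becomes 0/0, which R reports as NaN.

```r
# Per-bin sum of hp and per-bin number of cars
sums   <- xtabs(hp ~ cyl + gear, data = mtcars)
counts <- xtabs(~ cyl + gear, data = mtcars)

# Element-wise division gives the mean hp in each bin;
# the empty bin (8 cylinders, 4 gears) is 0/0 and so becomes NaN
means <- sums / counts
print(round(means, 1))
```

The result keeps the cyl and gear labels, so it prints in the same cross-tabulated layout as the original xtabs output.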

4 Answers

10

You can do it in one line using cast from the reshape package (after loading it with library(reshape)):

cast(mtcars, cyl ~ gear, value = 'hp', fun = mean)
Ramnath
7

One interesting response that I received from r-help is as follows:

> attach(mtcars)
> tapply(hp,list(cyl,gear),mean)
         3     4     5
4  97.0000  76.0 102.0
6 107.5000 116.5 175.0
8 194.1667    NA 299.5
> 
Mark
  • 1
    Yes, this is the proper thing!!! I don't use xtabs, I just use the standard tapply, apply, lapply functions since they accomplish everything. From the start I knew it must be solvable with standard tapply, and it is! Thanks. – Tomas Jul 23 '11 at 07:30
  • 1
    +1 for base, but no need to attach, just use function(x) list(x$cyl,x$gear), mean) in apply statement. Attach is bad programming practice and can lead to big problems later on. – Brandon Bertelsen Jul 23 '11 at 07:57
  • 3
    Or `with(mtcars, tapply(hp, list(cyl, gear), mean))` – Martin Morgan Jul 23 '11 at 12:21
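A small sketch of the attach-free variant mentioned in the comments, assuming only base R and the built-in mtcars data. Naming the list elements (cyl = and gear = here, an addition for readability) labels the two dimensions of the result, which the unnamed version leaves blank.

```r
# with() evaluates the expression inside mtcars, so no attach() is needed
means <- with(mtcars, tapply(hp, list(cyl = cyl, gear = gear), mean))

# Bins with no cars (8 cylinders, 4 gears) come back as NA
print(means)
```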
3

(Moving my comment to a response, so I can better edit it.)

I'm not sure how to do it with xtabs (which I've never used before), but here are a couple of ways of doing it using the reshape and plyr packages.

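> library(reshape)  # melt(), cast()
> library(plyr)     # ddply(), summarise()
> # Option 1: melt to long format, then cast with mean as the aggregate
> x = melt(mtcars, id = c("cyl", "gear"), measure = c("hp"))
> cast(x, cyl ~ gear, mean)

> # Option 2: compute the means with ddply, then cast to a wide table
> x = ddply(mtcars, .(cyl, gear), summarise, hp = mean(hp))
> cast(x, cyl ~ gear)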
grautur
0

Another way of calculating it is with the aggregate() function, although the output is a long data frame with one row per non-empty bin rather than a cross-tabulated table. (via Twitter)

> aggregate(hp~cyl+gear,data=mtcars,mean)
  cyl gear       hp
1   4    3  97.0000
2   6    3 107.5000
3   8    3 194.1667
4   4    4  76.0000
5   6    4 116.5000
6   4    5 102.0000
7   6    5 175.0000
8   8    5 299.5000
> 
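If the cross-tabulated layout is wanted, one possible follow-up (a sketch, not part of the original suggestion) is to pass the aggregate() result back through xtabs. Each cyl/gear pair appears at most once in the aggregated data, so the "sum" that xtabs computes per bin is just that bin's mean. The names agg and wide are illustrative. Note that the empty bin shows up as 0 here, not NA, which can be misleading.

```r
# Long format: one row per non-empty (cyl, gear) combination
agg <- aggregate(hp ~ cyl + gear, data = mtcars, FUN = mean)

# Spread into a cyl-by-gear table; each bin holds a single value,
# so summing leaves the mean unchanged (empty bins are filled with 0)
wide <- xtabs(hp ~ cyl + gear, data = agg)
print(round(wide, 1))
```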
Mark