R how to create pivot table-like data frame while 3 variables are involved?

Question

I have the following data frame in R which gives me the customers 1,2, and 3's transactional record. Each row shows the period type that the transaction has been made and the amount of money he spent.

id<-c(1,2,3,1,1,2,3,2,2)
period<-c("calib","valid","valid","calib","valid","valid","calib","calib","valid")
spent<-c(10,3,8,12,5,5,4,3,5)
df<-data.frame(id,period,spent)

now I need to create a new data frame which gives me the average 'spent' of each 'id' per transaction in different 'period's. The resulted table which I got in in excel pivot table should be as follow:

id  calib    valid
1     11       5
2     3        4.33
3     4        8

I know there should be a way to make this work in R but since I am new in R I'm not aware of it. I wonder if anyone can help me with this.

Down-voter, please indicate your reason for down-voting so that OP can learn and eventually improve his questions. — Roman Luštrik, Jul 22 '12 at 06:59

score 3 · Accepted Answer · answered Jul 22 '12 at 03:22

3

You can do this using dcast from the reshape2 package (among numerous other ways, I'm sure):

library(reshape2)
dcast(df,id~period,fun.aggregate = mean)

  id calib    valid
1  1    11 5.000000
2  2     3 4.333333
3  3     4 8.000000

(Note that I'm assuming you intended to include the spent vector in your data frame.)

answered Jul 22 '12 at 03:22

joran

169,992
32
429
468

thank you for your help. it works on my sample data frame but it is not still working on my main data frame. I think here since we just have 3 columns, it works but what if we have another column in the data frame; how can I specify that I need the id~period based on 'spent' average and not the other columns? – AliCivil Jul 22 '12 at 03:43
@AliTamaddoni If you run this code you'll see there's a warning that specifically addresses your concern and points you to the part of the documentation needed to fix it. (i.e. the `value.var` argument to `dcast`). – joran Jul 22 '12 at 03:48
actually im getting the warning message but I can't fix it – AliCivil Jul 22 '12 at 04:09
@AliTamaddoni What have you tried? I'm not sure that I, or the warning, could be any clearer. You select the column that contains the values to be aggregated with the `value.var` argument. By default, it guesses a column. Or you can set `value.var = "spent"`. – joran Jul 22 '12 at 04:12

score 3 · Answer 2 · answered Jul 22 '12 at 07:03

Here's a base solution for completeness.

I edited your code, you forgot to include spent into your end data.frame.

df <- data.frame(id, period, spent)
with(df, tapply(spent, INDEX = list(id, period), FUN = mean))

  calib    valid
1    11 5.000000
2     3 4.333333
3     4 8.000000

Sam Firke · Answer 3 · 2015-03-06T16:35:23.933

2

And using the dplyr and tidyr packages:

df %>%
  group_by(id,period) %>%
  summarise (avg = mean(spent)) %>%
  spread(period,avg) # reshapes dataframe from long to wide

edited Mar 06 '15 at 16:35

answered Feb 06 '15 at 16:29

Sam Firke

21,571
9
87
105

R how to create pivot table-like data frame while 3 variables are involved?

3 Answers3

Linked