I would like to rename some columns within a dplyr chain after casting. However, I don't know how to call the names of the current structure within the rename.vars function.

I have this data.frame:

library(gdata)
library(dplyr)
library(reshape2)

dat <- structure(list(
  user = c(1101L, 1102L, 1103L, 1104L, 1105L, 1101L, 1102L, 1103L, 1104L, 1105L,
           1101L, 1102L, 1103L, 1104L, 1105L, 1101L, 1102L, 1103L, 1104L, 1105L),
  campaign = structure(c(1L, 2L, 1L, 2L, 3L, 3L, 4L, 5L, 2L, 1L,
                         1L, 3L, 3L, 2L, 3L, 2L, 1L, 4L, 3L, 2L),
                       .Label = c("A", "B", "C", "D", "E"), class = "factor"),
  impression_number = c(1L, 1L, 1L, 1L, 1L, 2L, 2L, 2L, 2L, 2L,
                        3L, 3L, 3L, 3L, 3L, 4L, 4L, 4L, 4L, 4L)),
  .Names = c("user", "campaign", "impression_number"),
  class = "data.frame", row.names = c(NA, -20L))

which looks like this (first six rows shown):

   user campaign impression_number
1  1101        A                 1
2  1102        B                 1
3  1103        A                 1
4  1104        B                 1
5  1105        C                 1
6  1101        C                 2

When I try to run the following command, it errors because I am not referencing the names of the current object:

dat %>%
  dcast(user ~ impression_number, value.var = 'campaign') %>%
  rename.vars(names(.)[2:5], paste0('impression_', names(.)[2:5]))

Ideally, I want this data frame:

             user  impression_1  impression_2  impression_3  impression_4
1            1101             A             C             A             B
2            1102             B             D             C             A
3            1103             A             E             C             D
4            1104             B             B             B             C
5            1105             C             A             C             B

What can I do to refer to the names of the current object? I've also tried lhs from the documentation, but that's just a placeholder and didn't work either.

Thanks in advance!

maloneypatr
  • Maybe just change `impression_number` first with `mutate`? Like `mutate(impression_number = paste("impression", impression_number, sep = "_"))` and then chain to the casting. – aosmith Sep 18 '14 at 21:24
  • That definitely works, but I am looking for a universal way to call the existing object within the chain. I've run into this problem before in different instances, but finally realized this was an easy example to finally put on here. Thanks! – maloneypatr Sep 18 '14 at 21:27
  • Another example of when I would like to call the structure is when I try to filter out incomplete rows. Normally, I use `dat[complete.cases(dat), ]`, but I couldn't use this within a `dplyr` chain. – maloneypatr Sep 19 '14 at 13:59

1 Answer

I think this is one of the things that `do` is for. Before `do` existed, you would have written an anonymous function. This answer gives a nice example of how to work with `complete.cases` (which you referred to in a follow-up comment), using both an anonymous function and `do`.
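
As a rough sketch of that `complete.cases` pattern, here are both forms side by side. The data frame `df` and the result names are made up for illustration and are not from the linked answer.

```r
library(dplyr)

# Hypothetical data with a few missing values (illustration only)
df <- data.frame(x = c(1, NA, 3), y = c("a", "b", NA))

# Anonymous function: the piped data frame arrives as the argument d
res_fun <- df %>% (function(d) d[complete.cases(d), ])

# do(): the dot stands for the data frame coming down the chain
res_do <- df %>% do(.[complete.cases(.), ])
```

Both calls should keep only the rows that have no missing values.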

For renaming, you just need to put `rename.vars` inside `do` and use `.` to refer to the dataset.

dat %>%
  dcast(user ~ impression_number, value.var = 'campaign') %>%
  do(rename.vars(., names(.)[2:5], paste0('impression_', names(.)[2:5])))
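
For comparison, the anonymous-function form might look roughly like the sketch below. It assumes a cut-down, made-up subset of the data, and it renames with base R's `names<-` rather than `rename.vars`, so that only `dplyr` and `reshape2` are needed.

```r
library(dplyr)
library(reshape2)

# Small made-up subset of the question's data (illustration only)
small <- data.frame(
  user = rep(1101:1103, times = 2),
  campaign = c("A", "B", "A", "C", "D", "E"),
  impression_number = rep(1:2, each = 3)
)

wide <- small %>%
  dcast(user ~ impression_number, value.var = "campaign") %>%
  (function(d) {
    # d is the cast data frame; prefix every column except the first
    names(d)[-1] <- paste0("impression_", names(d)[-1])
    d
  })
```

Indexing with `-1` instead of `2:5` avoids hard-coding the number of impression columns.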
aosmith
  • That's exactly what I was looking for @aosmith! I knew it was doable, but I haven't come across the `do` function yet. Thanks again! – maloneypatr Sep 19 '14 at 17:41