Suppose this table:

Browse[2]> tra_all_data
   ID CITY COUNTRY PRODUCT   CATEGORY YEAR INDICATOR COUNT
1   1  VAL      ES  Tomato Vegetables 1999        10    10
2   2  MAD      ES    Beer    Alcohol 1999        20    20
3   3  LON      UK  Whisky    Alcohol 1999        30    30
4   4  VAL      ES  Tomato Vegetables 2000       100   100
5   5  VAL      ES    Beer    Alcohol 2000       121   121
6   6  LON      UK  Whisky    Alcohol 2000       334   334
7   7  MAD      ES  Tomato Vegetables 2000       134   134
8   8  LON      UK  Tomato Vegetables 2000       451   451
17 17  BIL      ES  Pincho       Meat 1999       180   180
18 18  VAL      ES  Orange Vegetables 1999       110   110
19 19  MAD      ES    Wine    Alcohol 1999       120   120
20 20  LON      UK    Wine    Alcohol 1999       230   230
21 21  VAL      ES  Orange Vegetables 2000       100   100
22 22  VAL      ES    Wine    Alcohol 2000       122   122
23 23  LON      UK      JB    Alcohol 2000       133   133
24 24  MAD      ES  Orange Vegetables 2000       113   113
25 25  MAD      ES  Orange Vegetables 2000       113   113
26 26  LON      UK  Orange Vegetables 2000       145   145

And this piece of code:

CURRENT_COLS <- c("PRODUCT", "YEAR", "CITY")
tra_dAGG <- tra_all_data %>%
    regroup(as.list(CURRENT_COLS)) %>%
    #group_by(PRODUCT, YEAR, CITY) %>%
    summarise(Percent = sum(COUNT)) %>%
    mutate(Percent = Percent / sum(Percent))

If I use this code as it is, I get the following warning:

Warning message:
'regroup' is deprecated.
Use 'group_by_' instead.
See help("Deprecated")

If I comment out the regroup line and use the group_by line instead, it works. However, CURRENT_COLS changes in each iteration, so I need to group by the contents of this variable. (I have hard-coded CURRENT_COLS here only to make the question easier to follow.)

Can anyone help me with this issue? How can I use a variable in group_by?

Thank you so much in advance.

My R version: 3.1.2 (2014-10-31)

talat
Adolfo
  • You probably need to use the newer standard evaluation versions of dplyr's functions. Try to replace the regroup by: `group_by_(.dots = CURRENT_COLS)` and make sure you have the current versions of dplyr and lazyeval installed and loaded. – talat Jan 16 '15 at 13:09
  • Thank you, it nicely works!! – Adolfo Jan 19 '15 at 09:01
  • okay I will post my comment as answer so it won't remain an open question forever – talat Jan 19 '15 at 09:04

1 Answer

You need to use the newer standard evaluation versions of dplyr's functions. They are denoted by an additional _ at the end of the function name, for example select_().

In your case, you can change your code to:

CURRENT_COLS <- c("PRODUCT", "YEAR", "CITY")
tra_dAGG <- tra_all_data %>%
    group_by_(.dots = CURRENT_COLS) %>%
    summarise(Percent = sum(COUNT)) %>%
    mutate(Percent = Percent / sum(Percent))

Make sure you have the latest versions of dplyr and lazyeval installed and loaded.
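A note for readers on a current dplyr: group_by_() was itself deprecated starting with dplyr 0.7.0. Below is a minimal sketch of the same aggregation, assuming dplyr 1.0.0 or later (needed for across(), all_of() and the .groups argument). The data frame is a five-row subset of the table in the question, typed in by hand for illustration.

```r
library(dplyr)

# Five rows copied from the question's table (IDs 1, 4, 7, 8 and 5).
tra_all_data <- data.frame(
  CITY    = c("VAL", "VAL", "MAD", "LON", "VAL"),
  COUNTRY = c("ES", "ES", "ES", "UK", "ES"),
  PRODUCT = c("Tomato", "Tomato", "Tomato", "Tomato", "Beer"),
  YEAR    = c(1999, 2000, 2000, 2000, 2000),
  COUNT   = c(10, 100, 134, 451, 121),
  stringsAsFactors = FALSE
)

CURRENT_COLS <- c("PRODUCT", "YEAR", "CITY")

tra_dAGG <- tra_all_data %>%
  # all_of() selects the columns named in the character vector
  group_by(across(all_of(CURRENT_COLS))) %>%
  # drop_last keeps the PRODUCT/YEAR grouping, as summarise() did by default
  summarise(Percent = sum(COUNT), .groups = "drop_last") %>%
  # each CITY's share within its PRODUCT/YEAR group
  mutate(Percent = Percent / sum(Percent))

print(tra_dAGG)
```

Because summarise() peels off only the last grouping column, the shares are computed within each PRODUCT/YEAR pair, not over the whole table. Reorder CURRENT_COLS if you want the shares taken over a different level.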

To learn more about standard/non-standard evaluation in dplyr, see the vignette NSE.
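Since the question mentions that the grouping columns change on every iteration, here is a rough sketch of how that loop could look with tidy evaluation, which replaced the underscore functions in later dplyr releases. It assumes dplyr 1.0.0 or later with rlang installed. share_by() and column_sets are hypothetical names made up for this example, and the data is a hand-typed subset of the question's table.

```r
library(dplyr)

# Five rows copied from the question's table (IDs 1, 4, 7, 8 and 5).
tra_all_data <- data.frame(
  CITY    = c("VAL", "VAL", "MAD", "LON", "VAL"),
  COUNTRY = c("ES", "ES", "ES", "UK", "ES"),
  PRODUCT = c("Tomato", "Tomato", "Tomato", "Tomato", "Beer"),
  YEAR    = c(1999, 2000, 2000, 2000, 2000),
  COUNT   = c(10, 100, 134, 451, 121),
  stringsAsFactors = FALSE
)

# Hypothetical helper: group by whatever column names are passed as strings.
share_by <- function(data, cols) {
  data %>%
    # syms() turns the strings into symbols; !!! splices them into group_by()
    group_by(!!!rlang::syms(cols)) %>%
    summarise(Percent = sum(COUNT), .groups = "drop_last") %>%
    mutate(Percent = Percent / sum(Percent)) %>%
    ungroup()
}

# Hypothetical column sets standing in for the values CURRENT_COLS takes.
column_sets <- list(
  c("PRODUCT", "YEAR", "CITY"),
  c("COUNTRY", "YEAR"),
  c("CITY")
)

results <- lapply(column_sets, share_by, data = tra_all_data)
print(results[[3]])
```

With a single grouping column, such as the last set, summarise() removes the only grouping level, so the shares are taken over the whole table.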

talat