What is the purpose of the dot before variables (i.e. ".variables") in the R plyr package?

For instance, from the R help file:

ddply(.data, .variables, .fun = NULL, ...,
    .progress = "none", .drop = TRUE, .parallel = FALSE)

Any assistance would be greatly appreciated.

MikeTP
  • I don't know, but they've always annoyed me because you can't use tab-completion for any of the function arguments. – Joshua Ulrich Jan 30 '13 at 16:35
  • When it came out, I hesitated to use it, because I thought there was something magical going on between the dots. – Dieter Menne Jan 30 '13 at 16:40
  • Dieter: that is my concern! – MikeTP Jan 30 '13 at 16:51
  • You can see the magic for yourself: `library(plyr); . ; ?"."`. It's also worth taking a look at the Hadley magic of `as.quoted` and the discussion [here](http://stackoverflow.com/questions/12850141/programming-safe-version-of-subset-to-evaluate-its-condition-while-called-from/12850252#12850252) – Justin Jan 30 '13 at 16:54
  • @JoshuaUlrich why can't you use tab-completion? – hadley Jan 30 '13 at 17:50
  • @hadley: because all the function arguments match `.`, so `ddply(<tab>` results in `ddply(.`; pressing `<tab>` again then suggests all the hidden stuff in base. – Joshua Ulrich Jan 30 '13 at 17:59
  • @JoshuaUlrich I think that's a problem with however you're doing tab-completion. It seems fine for me in rstudio. – hadley Jan 30 '13 at 18:15
  • @Justin: thanks for the hint. Never thought of looking into this. – Dieter Menne Jan 30 '13 at 18:17
  • @hadley: it's likely a problem with the console, but it affects me with the Windows GUI and Ubuntu command line. – Joshua Ulrich Jan 30 '13 at 18:18
  • @JoshuaUlrich -- By tab completion, do you mean typing `ddply(` to get a list of `ddply`'s formals? Or do you mean typing `ddply(.da` to get this completion: `ddply(.data=`? The latter works just fine for me on Windows (both GUI and Emacs). The former doesn't work so well... – Josh O'Brien Jan 30 '13 at 19:24

2 Answers

There may be two things going on that are confusing you.

One is the `.` function in the 'plyr' package. The `.` function lets you refer to a variable by name, as an unevaluated (quoted) expression, rather than to the value(s) the variable contains. For instance, in some functions, we want to refer to the object x itself rather than the value(s) stored in x. Base R offers quote() for a single expression, but 'plyr' provides the shorter .(x), which can also capture several names at once. The 'plyr' functions themselves use this a lot, like so:

ddply(data, .(row_1), summarize, total=sum(row_1))

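If we didn't use the `.` function, R would try to evaluate row_1 before 'ddply' ever saw it, either passing along the many values it contains or failing because no object of that name exists outside the data frame, when we really just want to refer to the column by name.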
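To see the quoting for yourself, here is a minimal sketch, assuming the 'plyr' package is installed. The data frame `df` and its columns `group` and `value` are made up for illustration and are not from the question.

```r
library(plyr)

# Hypothetical toy data; the names are assumptions for this example.
df <- data.frame(group = c("a", "a", "b"), value = c(1, 2, 3))

# .(group) captures the name `group` without evaluating it.
q <- .(group)
class(q)  # "quoted"

# ddply evaluates the quoted name inside df and splits the rows by it.
totals <- ddply(df, .(group), summarize, total = sum(value))

# .variables also accepts a character vector or a formula,
# so these two calls should split df the same way.
by_name    <- ddply(df, "group", summarize, total = sum(value))
by_formula <- ddply(df, ~group, summarize, total = sum(value))
```

Typing `group` on its own at the console (with no such object defined) fails, which is the situation `.()` avoids.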

The other "." in action here is the way people use it as a character in the function arguments' names. I'm not sure what the origin is, but a lot of people seem to do it just to highlight which variables are function arguments and which variables are only part of the function's internal code. The "." is just another character, in this case.

Dinre
  • It may be appropriate in your first explanation to use traditional compsci language, i.e. "pass by reference" as opposed to "pass by value." – Micah Henning Dec 20 '17 at 22:45

From http://www.jstatsoft.org/v40/i01

Note that all arguments start with ".". This prevents name clashes with the arguments of the processing function, and helps to visually delineate arguments that control the repetition from arguments that control the individual steps. Some functions in base R use all uppercase argument names for this purpose, but I think this method is easier to type and read.
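The name clash described in that passage can be sketched in base R alone. The helpers `apply_plain` and `apply_dotted` and the processing function `scale_by` below are hypothetical, written only for this illustration; they are not part of plyr.

```r
# Hypothetical wrappers: both apply a function to every element of a list
# and forward any extra arguments through `...`.
apply_plain  <- function(x, fun, ...) lapply(x, fun, ...)
apply_dotted <- function(.x, .fun, ...) lapply(.x, .fun, ...)

# A processing function that happens to have its own argument called `x`.
scale_by <- function(value, x) value * x

# With plain names, `x = 10` is matched to the wrapper's own `x`,
# so the list ends up in `fun` and the call fails.
clash <- tryCatch(apply_plain(list(1, 2), scale_by, x = 10),
                  error = function(e) e)
inherits(clash, "error")  # TRUE

# With dotted names, `x = 10` matches nothing in the wrapper,
# falls through `...`, and reaches scale_by as intended.
ok <- apply_dotted(list(1, 2), scale_by, x = 10)
```

The dotted call also works because `lapply` itself uses the uppercase names `X` and `FUN`, the base R convention the quoted passage mentions.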

Dieter Menne