5

How do you extend rbind() to work with a data.frame subclass? I cannot get it to work with even a very simple subclass. The following example demonstrates the issue:

Subclass and method definition:

new_df2 <- function(x, ...)
{
  stopifnot(is.data.frame(x))
  structure(x, class = c("df2", "data.frame"), author = "some user")
}

rbind.df2 <- function(..., deparse.level = 1)
{
  NextMethod()
}

I realize that extending rbind() is not necessary in this case, but my grand plan is to use rbind.data.frame() on my subclass and then add a few additional checks/attributes to its result.

If you call the following, you get an error: Error in NextMethod() : generic function not specified.

does not work:

t1 <- data.frame(a = 1:12, b = month.abb)
t2 <- new_df2(t1)
rbind(t2, t2)

I also tried using NextMethod(generic = "rbind"), but in that case, you receive this error: Error in NextMethod(generic = "rbind") : wrong value for .Method.

also does not work:

rbind.df2 <- function(..., deparse.level = 1)
{
  NextMethod(generic = "rbind")
}

rbind(t2, t2)

I'm at my wits' end, and I guess at the limits of my understanding of subclasses and methods too. Thanks for any help.

Karolis Koncevičius
r_alanb

2 Answers

3

I will treat the rbind()-specific case below, but first note that, in general, NextMethod() has no problem when the first argument is ... (regarding the bounty request), as this additional example shows:

f <- function(..., b = 3) UseMethod("f")
f.a <- function(..., b = 3) { print("yes"); NextMethod() }
f.integer <- function(..., b = 4) sapply(list(...), "*", b)
x <- 1:10
class(x) <- c("a", class(x))
f(x)

[1] "yes"
      [,1]
 [1,]    4
 [2,]    8
 [3,]   12
 [4,]   16
 [5,]   20
 [6,]   24
 [7,]   28
 [8,]   32
 [9,]   36
[10,]   40

f(x, b = 5)

[1] "yes"
      [,1]
 [1,]    5
 [2,]   10
 [3,]   15
 [4,]   20
 [5,]   25
 [6,]   30
 [7,]   35
 [8,]   40
 [9,]   45
[10,]   50

So why doesn't rbind.df2 work?

As it turns out, rbind() and cbind() are not normal generics. First, they are internally generic; see the "Internal Generics" section of Hadley Wickham's old S3 page for Advanced R, or this excerpt from the current Advanced R:

Some S3 generics, like [, sum(), and cbind(), don’t call UseMethod() because they are implemented in C. Instead, they call the C functions DispatchGroup() or DispatchOrEval().

This isn't quite enough to cause us trouble, as we can see using sum() as an example:

sum.a <- function(x, na.rm = FALSE) { print("yes"); NextMethod() } 
sum(x)

[1] "yes"
[1] 55

However, for rbind and cbind it's even weirder, as recognized in comments in the source code (starting around line 1025):

/* cbind(deparse.level, ...) and rbind(deparse.level, ...) : */
/* This is a special .Internal */

... (Some code omitted) ...

    /* Lazy evaluation and method dispatch based on argument types are
     * fundamentally incompatible notions.  The results here are
     * ghastly.

After that, some explanation of the dispatch rules is given, but so far I haven't been able to use that information to make NextMethod() work. For the use case given above, I would follow the advice of F. Privé from the comments and call the data frame method directly:

new_df2 <- function(x, ...)
{
    stopifnot(is.data.frame(x))
    structure(x, class = c("df2", "data.frame"))
}

rbind.df2 <- function(..., deparse.level = 1)
{
    print("yes") # Or whatever else you want/need to do
    base::rbind.data.frame(..., deparse.level = deparse.level)
}

t1 <- data.frame(a = 1:12, b = month.abb)
t2 <- new_df2(t1)
rbind(t2, t2)

[1] "yes"
    a   b
1   1 Jan
2   2 Feb
3   3 Mar
4   4 Apr
5   5 May
6   6 Jun
7   7 Jul
8   8 Aug
9   9 Sep
10 10 Oct
11 11 Nov
12 12 Dec
13  1 Jan
14  2 Feb
15  3 Mar
16  4 Apr
17  5 May
18  6 Jun
19  7 Jul
20  8 Aug
21  9 Sep
22 10 Oct
23 11 Nov
24 12 Dec
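
To get closer to the "grand plan" from the question (post-processing the result of rbind.data.frame()), the same pattern can be extended along these lines. This is only a minimal sketch under a few assumptions: the author argument on new_df2() is a hypothetical addition that is not in the original constructor, and taking the attribute from the first argument is just one possible policy.

```r
# Constructor; the `author` argument is a hypothetical addition for this sketch.
new_df2 <- function(x, author = "some user") {
  stopifnot(is.data.frame(x))
  structure(x, class = c("df2", "data.frame"), author = author)
}

rbind.df2 <- function(..., deparse.level = 1) {
  # Call the data frame method directly instead of using NextMethod().
  out <- base::rbind.data.frame(..., deparse.level = deparse.level)
  # Extra checks on the result would go here.
  stopifnot(is.data.frame(out))
  # Re-apply the subclass and the attribute explicitly
  # (assumption: the first argument's author wins).
  new_df2(out, author = attr(..1, "author"))
}

t1 <- data.frame(a = 1:12, b = month.abb)
t2 <- new_df2(t1)
res <- rbind(t2, t2)
class(res)
attr(res, "author")
```

Setting the class and attribute explicitly at the end means the method does not rely on what rbind.data.frame() happens to preserve.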
duckmayr
  • Thanks for the reply. However I think this is incorrect. In your example you have `data.frame` as the first class. Which means when you call `rbind(t1, t2)` it calls `rbind.data.frame` (based on the first class). And as a result the `rbind.df2` is not called and "yes" string never printed. – Karolis Koncevičius Sep 17 '18 at 10:16
  • @KarolisKoncevičius Thanks for the comment, I didn't catch that. Will investigate further, though please see edits regarding an additional example where for sure `NextMethod()` works with `...` as the first argument. – duckmayr Sep 17 '18 at 10:18
  • Thanks for looking into it. I think your example demonstrates that the error is not because of the ellipses (which was just a guess). No idea then why else would `rbind` stop working. But I would be really interested to learn how to use NextMethod with rbind properly. The error is about wrong ".Method" specification in my case. Maybe assigning the ".Method" manually is a possibility? – Karolis Koncevičius Sep 17 '18 at 10:29
  • thanks for looking so deeply into this. I cannot accept the answer, as I am not the author of the question, but already gave it +1. Will wait a bit more to see if any other answers are added. But will gladly award the bounty to this answer (unless something elegant that makes NextMethod work comes along). – Karolis Koncevičius Sep 17 '18 at 12:23
1

The answer is to extend rbind2, not rbind. From the help page for rbind2:

"These are (S4) generic functions with default methods.

...

The main use of cbind2 (rbind2) is to be called recursively by cbind() (rbind()) when both of these requirements are met:

  • There is at least one argument that is an S4 object, and

  • S3 dispatch fails (see the Dispatch section under cbind)."
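
A rough sketch of what registering an S3 class for S4 dispatch and defining an rbind2 method could look like is given here. It rests on assumptions: the df2 class and the new_df2() constructor are taken from the question, the method body is only illustrative, and rbind2() is called directly. Whether a plain rbind() call reaches this method depends on the dispatch conditions quoted above.

```r
library(methods)

# Constructor from the question (without the attribute, for brevity).
new_df2 <- function(x) {
  stopifnot(is.data.frame(x))
  structure(x, class = c("df2", "data.frame"))
}

# Register the S3 class so it can appear in S4 method signatures.
setOldClass(c("df2", "data.frame"))

# Illustrative rbind2 method for two df2 objects.
setMethod("rbind2", signature(x = "df2", y = "df2"),
  function(x, y, ...) {
    out <- base::rbind.data.frame(x, y)
    new_df2(out)  # re-apply the subclass explicitly
  })

t2 <- new_df2(data.frame(a = 1:12, b = month.abb))
res2 <- rbind2(t2, t2)  # direct call to the S4 generic
class(res2)
```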

JDL
  • Thank you for the comment. But the question is about S3 generics, not S4. However I think it's nice to have it here as people looking for such answers will be informed about the possibility of using S4. – Karolis Koncevičius Sep 17 '18 at 10:24
  • You can have S4 methods with S3 classes; just call `setOldClass("myClass")` first. This is the standard technique when you want to use S3 classes, but the simple S3 method dispatch does not meet your needs. – JDL Sep 18 '18 at 07:14
  • Sounds interesting. But I am a bit unsure at which point `setOldClass` should be called. Should I call it within the generic and have it switch to another generic of S4 class within it? – Karolis Koncevičius Sep 18 '18 at 11:24
  • `setOldClass` is called as a standalone, at the same point you would call `setClass`. I.e. as early as possible, and before you define the method for `rbind2`. – JDL Sep 18 '18 at 12:57