I am in the process of adding a standard S3 method dispatch system to my package (OneR) where I have one method for data frames and one for formulas.

The problem is that the two methods need different arguments. I don't need the data argument when calling the data frame method because the data is already present in the x argument; data is only needed when calling the formula method.

I did it like this:

Usage

optbin(x, data, method = c("logreg", "infogain", "naive"), na.omit = TRUE)

## S3 method for class 'formula'
optbin(x, data, method = c("logreg", "infogain", "naive"),
  na.omit = TRUE)

## S3 method for class 'data.frame'
optbin(x, data = x, method = c("logreg", "infogain",
  "naive"), na.omit = TRUE)


Arguments

x  either a formula or a data frame with the last column containing the target variable.
data  data frame which contains the data, only needed when using the formula interface because otherwise 'x' will already contain the data.
method  character string specifying the method for optimal binning, see 'Details'; can be abbreviated.
na.omit  logical value whether instances with missing values should be removed.

I first thought that I could just leave out the data argument in the data frame method, but when checking the package I get a warning because the argument is present in the generic (the function that calls UseMethod). When I leave it out there as well, I get another warning because of the inconsistencies between the methods. I also tried ..., but that produces warnings too, and I would have to document it, which would confuse users more than it would help.

I don't find my solution above ideal either, because of the data = x argument in the data frame method. It could confuse people and is a potential source of errors.

My question
What is the best way to resolve the situation, i.e. when you have two methods with different arguments?

Hong Ooi
vonjd
  • Couldn't you just use `data = NULL` in `optbin.data.frame` and document it as being unused? Something similar is done for the `drop` parameter in `data.table::data.table`: *"Never used by data.table. Do not use. It needs to be here because data.table inherits from data.frame."*. – nrussell May 01 '17 at 11:14
  • @nrussell: I think this is a valid approach... could you please create an answer out of it - Thank you – vonjd May 01 '17 at 12:43

1 Answer

The usual approach is to have a generic with no extra arguments except .... Each interface method should then call an underlying default method that implements the actual model-fitting.

optbin <- function(x, ...)
UseMethod("optbin")

optbin.formula <- function(formula, data, method, na.omit, arg1, arg2, ...)
{
  ...
  optbin.default(x, y, arg1, arg2)
}

# first argument is named x so that it matches the generic's signature
optbin.data.frame <- function(x, method, na.omit, arg1, arg2, ...)
{
  ...
  optbin.default(x, y, arg1, arg2)
}

optbin.default <- function(x, y, arg1, arg2)
{ ... }

See for example how the nnet and MASS packages handle methods for formulas.
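To make the pattern above concrete, here is a minimal runnable sketch. The argument names (method, na.omit) and the "target is the last column" rule are taken from the question's usage section. The body of optbin.default is only a placeholder that reports what it received; it is an assumption for illustration and not the binning code of the OneR package. As far as I know, R CMD check tolerates a formula method whose first argument is called formula rather than x, while other methods should keep the generic's argument name.

```r
# Generic: only the dispatch argument plus `...`
optbin <- function(x, ...) UseMethod("optbin")

# Formula interface: the only method that needs `data`
optbin.formula <- function(formula, data,
                           method = c("logreg", "infogain", "naive"),
                           na.omit = TRUE, ...) {
  method <- match.arg(method)
  # `na.omit` is a logical here, so the stats functions are namespaced
  na_action <- if (na.omit) stats::na.omit else stats::na.pass
  mf <- stats::model.frame(formula, data = data, na.action = na_action)
  y <- mf[[1L]]                # the response is the first column
  x <- mf[, -1L, drop = FALSE] # the remaining columns are predictors
  optbin.default(x, y, method = method)
}

# Data frame interface: no `data` argument, target is the last column
optbin.data.frame <- function(x,
                              method = c("logreg", "infogain", "naive"),
                              na.omit = TRUE, ...) {
  method <- match.arg(method)
  if (na.omit) x <- stats::na.omit(x)
  y <- x[[ncol(x)]]
  optbin.default(x[, -ncol(x), drop = FALSE], y, method = method)
}

# Workhorse: placeholder body (hypothetical), just echoes its inputs
optbin.default <- function(x, y, method, ...) {
  list(predictors = names(x), n = length(y), method = method)
}

# Both interfaces end up in the same default method
res_formula <- optbin(Species ~ ., data = iris)
res_df      <- optbin(iris, method = "naive")
```

With this layout each method documents only the arguments it actually uses, so the data frame method never has to carry a dummy data argument.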

vonjd
Hong Ooi