There are a few questions here; I would be satisfied if any one of them were answered sufficiently well.
Background - what is the end goal?
I am interested in representing a date range in R. The bare-minimum requirement is that we represent a start and end date, which can easily be done using a length-two date vector. Additionally, it would be nice to extend this object into a class which further
- supplies a name to each range (i.e. a character string)
- enables the (easy) use of the dplyr::between function
Shortcomings of my previous approach
I've previously represented each range as a length-two date vector. The upside here is that I don't rely on any external dependencies, and my data structure is so lightweight that it's not a hassle to program with. The downside is that I'm tired of having to access the beg and end of the date range via the [ operator and arguments 1 and 2 respectively (arguably less interpretable than if we had a class implementation).
Also, we ultimately deal with a sequence of date ranges (i.e. a vector), so abstracting away the DateRange is helpful before we start nesting data structures. I do not want to use a list of length-two date vectors, nor do I wish to use a data.frame with two rows where each column is interpreted as a date range.
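To make the vector-of-ranges concern concrete, here is a minimal base-R sketch of one shape such an object could take: a single S3 object holding parallel beg, end and name vectors. The names date_ranges, length.date_ranges, `[.date_ranges`, format.date_ranges and print.date_ranges are hypothetical and only for illustration; this is not a settled design.

```r
# Hypothetical constructor: one object holding many ranges as parallel vectors.
date_ranges <- function(beg, end, name = as.character(beg)) {
  beg <- as.Date(beg)
  end <- as.Date(end)
  stopifnot(
    length(beg) == length(end),
    length(name) == length(beg),
    all(beg <= end)
  )
  structure(list(beg = beg, end = end, name = name), class = "date_ranges")
}

# Number of ranges, not number of fields.
length.date_ranges <- function(x) length(x$beg)

# Subsetting keeps the three fields aligned.
`[.date_ranges` <- function(x, i) date_ranges(x$beg[i], x$end[i], x$name[i])

# One line of text per range.
format.date_ranges <- function(x, ...) {
  paste0(x$name, ": [", x$beg, ", ", x$end, "]")
}

print.date_ranges <- function(x, ...) {
  cat(format(x), sep = "\n")
  invisible(x)
}

r <- date_ranges(
  c("2012-01-01", "2013-01-01"),
  c("2012-12-31", "2013-12-31"),
  c("FY12", "FY13")
)
r$beg[2]  # named access instead of positional [1] / [2]
print(r[2])
```

With this layout the start and end dates are reached by name (r$beg, r$end) rather than by position, and a whole sequence of ranges lives in one object without nesting.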
Where have I looked?
I've looked at the lubridate package and have considered inheriting from its Interval class. The downside to starting with this inheritance is that I don't think S4 is necessary for my use case. I just need a few simple data attributes and a nice API for calling dplyr::between.
An ideal solution might just extend the lubridate::Interval class to hold a name and an end date (the latter could be a method, as this information is already stored in Interval via @start + @.Data), and extend dplyr::between to play nicely with said class.
What have I tried?
Here's a rough implementation of what I'm looking for:
# 3 key attributes: beg, end, and name.
MyInterval <- function(beg, end, name = NULL) {
  # Accept character input and convert it to Date.
  if (is.character(beg)) beg <- as.Date(beg)
  if (is.character(end)) end <- as.Date(end)
  # Default the name to the start date.
  if (is.null(name)) name <- as.character(beg)
  structure(.Data = list('beg' = beg, 'end' = end, 'name' = name), class = "MyInterval")
}
Now, I would like to be able to overload the between function such that I may call it as follows: between(x, MyInterval), where we notice that dplyr::between(x, lo, hi) expects three arguments. To try and accomplish this, I've tried to set up type dispatching as follows:
between <- function(...) UseMethod('between')
between.MyInterval <- function(interval, x) {
  # Accept character input and convert it to Date.
  if (is.character(x)) x <- as.Date(x)
  dplyr::between(x, interval$beg, interval$end)
}
between.default <- function(x, lo, hi) dplyr::between(x, lo, hi)
The reason I chose to use ... in the prototype for between is that the order of arguments currently differs between between.MyInterval and between.default. Is there a better way to code this up? I believe the behavior is as desired (at first glance, at least):
i <- MyInterval("2012-01-01", "2012-12-31")
between(i, "2012-02-01") # Dispatches to between.MyInterval. Returns TRUE as expected.
between(150, 100, 200) # Dispatches to dplyr::between. Good, we didn't break anything?
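One alternative I could imagine, sketched here rather than tested in anger: since UseMethod accepts a second argument naming the object to dispatch on, the generic could dispatch on the lower bound instead of the first argument. That would keep the argument order identical across methods and allow the between(x, MyInterval) call shape. The argument names left and right are my own choice, and the base-R comparisons are a stand-in so the sketch runs without dplyr; dplyr::between(x, left, right) could presumably be called in the default method instead.

```r
# Constructor repeated so the sketch is self-contained.
MyInterval <- function(beg, end, name = NULL) {
  if (is.character(beg)) beg <- as.Date(beg)
  if (is.character(end)) end <- as.Date(end)
  if (is.null(name)) name <- as.character(beg)
  structure(list(beg = beg, end = end, name = name), class = "MyInterval")
}

# Dispatch on the second argument (left) rather than the first (x).
between <- function(x, left, right) UseMethod("between", left)

# Called as between(x, interval); `right` is simply left unused.
between.MyInterval <- function(x, left, right) {
  if (is.character(x)) x <- as.Date(x)
  x >= left$beg & x <= left$end
}

# Fallback for plain bounds. Base-R stand-in for dplyr::between().
between.default <- function(x, left, right) {
  x >= left & x <= right
}

i <- MyInterval("2012-01-01", "2012-12-31")
between("2012-02-01", i)  # dispatches to between.MyInterval
between(150, 100, 200)    # dispatches to between.default
```

Both calls return TRUE here, and a date outside the range, such as "2013-01-01", gives FALSE. The generic no longer needs ... because every method shares the same (x, left, right) signature.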
Thank you
Any criticism is welcome. I know that between is a function that doesn't do type dispatching out of the box, so implementing this myself feels like a code smell.