
There are a few questions here; I would be satisfied if any one of them were answered sufficiently well.

Background - what is the end goal?

I am interested in representing a date range in R. The bare-minimum requirement is that we represent a start and an end date, which can easily be done using a length-two date vector. Additionally, it would be nice to extend this object into a class that also

  • supplies a name to each range (i.e. a character string)
  • enables (easy) use of the dplyr::between function

Shortcomings of my previous approach

I've previously represented each range as a length-two date vector. The upside is that I don't rely on any external dependencies, and the data structure is so lightweight that it's not a hassle to program with. The downside is that I'm tired of having to access the beginning and end of the date range via the [ operator with indices 1 and 2 respectively (arguably less interpretable than named fields in a class implementation).

Also, we ultimately deal with a sequence of date-ranges (i.e. a vector), and so abstracting away the DateRange is helpful before we start nesting data structures. I do not want to use a list of length-two date vectors nor do I wish to use a data.frame with two rows, each column being interpreted as a date-range.
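One way to picture the "sequence of date ranges" requirement is a single S3 object that holds parallel beg, end, and name vectors, so that one object represents many ranges. The following is only a base-R sketch under that assumption; the names date_ranges, `[.date_ranges`, and format.date_ranges are hypothetical and do not come from any package.

```r
# Hypothetical constructor: one object holding many named date ranges
# as three parallel vectors (beg, end, name).
date_ranges <- function(beg, end, name = as.character(beg)) {
  beg <- as.Date(beg)
  end <- as.Date(end)
  stopifnot(
    length(beg) == length(end),
    length(beg) == length(name),
    all(beg <= end)
  )
  structure(list(beg = beg, end = end, name = name), class = "date_ranges")
}

# Subsetting keeps the three vectors aligned.
`[.date_ranges` <- function(x, i) {
  date_ranges(x$beg[i], x$end[i], x$name[i])
}

# One human-readable line per range.
format.date_ranges <- function(x, ...) {
  paste0(x$name, ": ", format(x$beg), " to ", format(x$end))
}

print.date_ranges <- function(x, ...) {
  cat(format(x), sep = "\n")
  invisible(x)
}

r <- date_ranges(
  beg  = c("2012-01-01", "2013-01-01"),
  end  = c("2012-12-31", "2013-12-31"),
  name = c("FY2012", "FY2013")
)
r$beg[2]   # start of the second range, accessed by name rather than position
r[2]       # a date_ranges object containing only the second range
```

With this layout the start and end are reached through $beg and $end rather than positional indices, and a whole sequence of ranges stays in one object.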

Where have I looked?

I've looked at the lubridate package and have considered inheriting from its Interval class. The downside of starting with this inheritance is that I don't think S4 is necessary for my use case. I just need a few simple data attributes and a nice API for calling dplyr::between.

An ideal solution might just extend the lubridate::Interval class to hold a name and an end date (the latter could be a method, as this information is already stored in Interval via @start + @.Data), and extend dplyr::between to play nicely with said class.

What have I tried?

Here's a rough implementation of what I'm looking for:

# 3 key attributes: beg, end, and name.
MyInterval <- function(beg, end, name = NULL) {
    # is.character() is safer than class(x) == "character",
    # which misbehaves for objects with more than one class.
    if (is.character(beg)) beg <- as.Date(beg)
    if (is.character(end)) end <- as.Date(end)
    if (is.null(name)) name <- as.character(beg)
    structure(.Data = list('beg' = beg, 'end' = end, 'name' = name), class = "MyInterval")
}

Now, I would like to be able to overload the between function so that I can call it as between(x, MyInterval), whereas dplyr::between(x, lo, hi) expects three arguments. To try to accomplish this, I've set up type dispatching as follows:

between <- function(...) UseMethod('between')
between.MyInterval <- function(interval, x) {
    if (is.character(x)) x <- as.Date(x)
    dplyr::between(x, interval$beg, interval$end)
}
between.default <- function(x, lo, hi) dplyr::between(x, lo, hi)

The reason I chose to use ... in the prototype for between is that the order of arguments currently differs between between.MyInterval and between.default. Is there a better way to code this up? I believe the behavior is as desired (at first glance):

i <- MyInterval("2012-01-01", "2012-12-31")
between(i, "2012-02-01") # Dispatches to between.MyInterval. Returns TRUE as expected.
between(150, 100, 200)   # Dispatches to dplyr::between. Good, we didn't break anything?
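One possible way to avoid the ... prototype, sketched here under assumptions rather than offered as the definitive approach: UseMethod() accepts a second argument naming the object to dispatch on, so the generic can keep the fixed signature between(x, lo, hi) and dispatch on lo. To stay dependency-free, the methods below use plain comparisons instead of dplyr::between; that substitution is an assumption of this sketch.

```r
# Constructor from the question, using is.character() for the type check.
MyInterval <- function(beg, end, name = NULL) {
  if (is.character(beg)) beg <- as.Date(beg)
  if (is.character(end)) end <- as.Date(end)
  if (is.null(name)) name <- as.character(beg)
  structure(list(beg = beg, end = end, name = name), class = "MyInterval")
}

# Generic with a fixed signature; dispatch happens on `lo`, not on `x`.
between <- function(x, lo, hi) UseMethod("between", lo)

# `hi` is left unused here, so it may be missing in the call.
between.MyInterval <- function(x, lo, hi) {
  if (is.character(x)) x <- as.Date(x)
  x >= lo$beg & x <= lo$end
}

# Fallback for plain bounds (base-R comparison instead of dplyr::between).
between.default <- function(x, lo, hi) x >= lo & x <= hi

i <- MyInterval("2012-01-01", "2012-12-31")
between("2012-02-01", i)   # TRUE
between(150, 100, 200)     # TRUE
```

The argument order now matches across methods, with x always first. Note that attaching dplyr after defining this generic would mask it with dplyr's own between.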

Thank you

Any criticism is welcome. I know that between is a function that doesn't do type dispatching out of the box, so implementing this myself feels like a code smell.

Jaap
Andreas
  • You might also want to look at `data.table`'s `inrange`-function. See for an example [here](https://stackoverflow.com/a/43642205/2204410). – Jaap Mar 03 '18 at 10:58

1 Answer


A possibility is to use data.table's inrange-function.

First, let's make an interval:

library(data.table)

my.interval <- function(beg, end) data.table(beg = as.Date(beg), end = as.Date(end))
mi <- my.interval("2012-01-01", "2012-12-31")

Now you can do:

> as.Date("2012-02-01") %inrange% mi
[1] TRUE

Or define your own inrange-function:

my.inrange <- function(x, intv) data.table::inrange(as.Date(x), intv$beg, intv$end)

With that you can do:

> my.inrange("2012-02-01", mi)
[1] TRUE

As @Frank commented, you can make an infix variant of my.inrange too:

`%my.inrange%` <- my.inrange

Now you can use it in the following notation as well:

"2012-02-01" %my.inrange% mi

This is similar to the infix notation of data.table's %between% and %inrange% operators.

Jaap