0

I have a dataframe of services bookings. Each booking has a contract start and end date. For a given reporting date, I want to determine if the contract is active and, if so, how much to bill based on the monthly billing rate. If the contract ends mid-month, I pro-rate the billing for the final month. Here's the dataframe:

> bookings
     Account Service  MonthlyRate ContractStart ContractEnd
     1 A       W              50 2018-01-01    2018-12-31 
     2 A       X              75 2018-03-15    2019-03-14 
     3 B       W              60 2018-02-28    2018-09-30 
     4 B       X              90 2018-05-12    2019-08-11 
     5 B       Y              45 2018-02-28    2018-09-30 
     6 C       Y              50 2018-07-31    2019-04-30 
     7 D       W              65 2019-01-01    2019-03-31 
     8 D       Y              50 2018-09-01    2019-05-31 
     9 D       Z             110 2018-08-22    2019-12-31 
    10 E       Z             100 2018-10-01    2019-09-30 

I've written a function using lubridate to calculate the monthly billing.

    monthly_revenue <- function(reporting_date, monthly_rate, start, end) {
      contract_int <- interval(start, end) # Contract interval
      # Calculate interval ending the last day of the month of contract end
      end_of_month <- end
      day(end_of_month) <- days_in_month(end)
      end_of_month_int <- interval(start, end_of_month)
      # Check if reporting date is within contract interval
      if(reporting_date %within% contract_int) {
        val <- 1 # bill for entire month
        # If not within interval, check if contract is in its last month
      } else if (reporting_date %within% end_of_month_int) {
        val <- day(end) / days_in_month(end) # prorate monthly charges
      } else { # Not within contract
        val <- 0 # zero revenue
      }
      val * monthly_rate
    }

I then set a billing date and apply the function rowwise to the data frame:

    billing_date <- as.Date("2019-03-29")
    revenue_for_month <-bookings %>%
      rowwise() %>%
      mutate(Revenue = monthly_revenue(billing_date, MonthlyRate, ContractStart, ContractEnd))

Which results in the following error:

   Error in mutate_impl(.data, dots) : 
      Evaluation error: non-numeric argument to binary operator.

I can't tell if the problem is with my function or how I'm iterating. Any help would be sincerely appreciated.

[follow-up based on comments received] I am using the following library calls:

library(tidyverse)
library(lubridate)

And here is the dput output for my dataframe:

> dput(bookings)
structure(list(Account = c("A", "A", "B", "B", "B", "C", "D", 
"D", "D", "E"), Type = c("W", "X", "W", "X", "Y", "Y", "W", "Y", 
"Z", "Z"), MonthlyRate = c(50L, 75L, 60L, 90L, 45L, 50L, 65L, 
50L, 110L, 100L), ContractStart = structure(c(NA_real_, NA_real_, 
NA_real_, NA_real_, NA_real_, NA_real_, NA_real_, NA_real_, NA_real_, 
NA_real_), class = "Date"), ContractEnd = structure(c(NA_real_, 
NA_real_, NA_real_, NA_real_, NA_real_, NA_real_, NA_real_, NA_real_, 
NA_real_, NA_real_), class = "Date")), .Names = c("Account", 
"Type", "MonthlyRate", "ContractStart", "ContractEnd"), row.names = c(NA, 
-10L), spec = structure(list(cols = structure(list(Account = structure(list(), class = c("collector_character", 
"collector")), Type = structure(list(), class = c("collector_character", 
"collector")), MonthlyRate = structure(list(), class = c("collector_integer", 
"collector")), ContractStart = structure(list(), class = c("collector_character", 
"collector")), ContractEnd = structure(list(), class = c("collector_character", 
"collector"))), .Names = c("Account", "Type", "MonthlyRate", 
"ContractStart", "ContractEnd")), default = structure(list(), class = c("collector_guess", 
"collector"))), .Names = c("cols", "default"), class = "col_spec"), class = c("tbl_df", 
"tbl", "data.frame"))
EPBaron
  • 11
  • 3
  • 4
    Please share the dataframe with `dput` – Sonny Mar 30 '19 at 06:33
  • 3
    Please specify all non-base packages with `library()` calls. What is e.g. `%>%`, `interval`? – jay.sf Mar 30 '19 at 07:10
  • I have to agree with the previous comments, I believe you need to provide the structure of your data as well as the libraries you are using, because when I tried your code with the sample data you provided, I didn't get the error, however I made some assumptions first like converting the columns `ContractStart` and `ContractEnd` with `lubridate::ymd()` and using `lubridate::interval` – DS_UNI Mar 30 '19 at 11:19
  • I have added the library calls as well as the output of `dput()`. @DS_UNI, you are correct that I have applied `lubridate::ymd()` to the ContractStart and ContractEnd fields, and my intervals are definded using the `lubridate::interval` function. As you can likely tell, I'm not a coder by profession. I'm not sure how much code is useful to share, so I appreciate your feedback. – EPBaron Mar 30 '19 at 19:03
  • The output of `dput()` indicates that the columns `ContractStart` and `ContractEnd` are `NA` s , which is what causing your problem, the result of `reporting_date %within% contract_int` would be in this case `NA` which is not acceptable as an input-condition to `if`, you should go over the steps you went through to get the data, something there is not working well – DS_UNI Mar 30 '19 at 20:17
  • 1
    Thanks @DS_UNI. I have met the enemy, and he is me. In my real dataset, I was using data from a Salesforce report. The currency is output in USD, with `$` and `,` for the thousands separator. I hadn't realized that it would be interpreted as a string or, if I explicitly tried to coerce it with `as.numeric`, it would become `NA`. I have added the line `as.numeric(gsub('[$,]', '', bookings$MonthlyRate))`, and all is well with the world. Very sorry to have not caught this on my own, and thanks for the help. – EPBaron Mar 31 '19 at 05:59

1 Answers1

0

I've changed your function up quite a bit, because I ran into numerous issues. Now it works for me:

monthly_revenue <- function(reporting_date, monthly_rate, start, end) {
  contract_int <- interval(start, end) # Contract interval
  EoM_int <- interval(start, ceiling_date(as_date(end),unit="month")-1)

  reporting_date <- as_datetime(reporting_date)

  if(reporting_date %within% contract_int) {
    val <- 1 # bill for entire month
    # If not within interval, check if contract is in its last month
  } else if (reporting_date %within% EoM_int) {
    val <- day(end) / day(ceiling_date(as_date(end),unit="month")-1) # prorate monthly charges
  } else { # Not within contract
    val <- 0 # zero revenue
  }
  return(val * monthly_rate)
}

Your dplyr code is correct and runs fine.

Julian_Hn
  • 2,086
  • 1
  • 8
  • 18