0

I am trying to calculate the difference in months between two dates in R using dbplyr package, I want to send the sql query to calculate it using "timestampdiff" native function in mysql but I'm getting an error:

library(tidyverse)
library(lubridate)
library(dbplyr)     

        db_df <- tbl(con, "creditos")

        db_df %>% mutate(diff_month = timestampdiff(month, column_date_1, column_date_2))

but the parameter month is not been translating correctly because it looks like an object or function in R:

Error in UseMethod("escape") : no applicable method for 'escape' applied to an object of class "function"

And if written this way:

db_df %>% mutate(diff_month = timestampdiff("month", column_date_1, column_date_2))

I will also get an error:

You have an error in your SQL syntax; check the manual that corresponds to your MySQL server version for the right syntax to use near month, column_date_1, column_date_2) AS diff_month

And I believe this is because dbplyr is writing "month" with double quotes into mysql, and it should be without double quotes, something like this:

TIMESTAMPDIFF(month, column_date_1, column_date_2) AS `diff_month`

Or is there a better way to calculate month difference using dbplyr?

Simon.S.A.
  • 6,240
  • 7
  • 22
  • 41

1 Answers1

0

month is a function in the lubridate package. It looks like mutate is being passed month as the R function month() instead of as text.

If you are using native SQL to compute the time difference, then you should not need the lubridate package.

Two possible solutions:

  1. Remove library(lubridate) from your pre-amble and refer to lubridate packages using the prefix lubridate::. E.g.: lubridate::ymd_hms
  2. Capitalize the parts of your mutate command that you want run in native SQL. This should help the SQL translation distinguish them from the lower case varients that have other meanings. E.g.: db_df %>% mutate(diff_month = TIMESTAMPDIFF(MONTH, column_date_1, column_date_2))
Simon.S.A.
  • 6,240
  • 7
  • 22
  • 41
  • You unaccepted my answer. Please would you give a reason why? – Simon.S.A. May 04 '19 at 02:34
  • We have to use sql function from dplyr to help the code translate the SQL query, I already edit your answer – Alvaro Pabon Jun 12 '19 at 17:04
  • This is another approach but it is not essential. The example in my original answer works as expected in my context. Much of the dbplyr advice I have read encourages users to avoid the `sql` function where possible. – Simon.S.A. Jun 12 '19 at 21:00
  • Without `sql` I get an error: "Error: 'SELECT `id_credito`, `fecha_desem`, TIMESTAMPDIFF(`MONTH`, `fecha_desem`, CURDATE()) AS `diff_month` FROM creditos.desembolsos LIMIT 11' nanodbc/nanodbc.cpp:1587: 42000: [MySQL][ODBC 8.0(a) Driver][mysqld-5.6.42]You have an error in your SQL syntax; check the manual that corresponds to your MySQL server version for the right syntax to use near '`MONTH`, `fecha_desem`, CURDATE()) AS `diff_month` FROM creditos.desembolsos LIM' at line 1" – Alvaro Pabon Jun 12 '19 at 21:54
  • Does it work if you replace `CURDATE()` with a hard coded date? E.g. '2019-06-12'. If so, this is different from what you originally posted. For this variation `sql()` may be needed. – Simon.S.A. Jun 12 '19 at 22:50