Here is a very small subset of my dataset:

db_country <- tibble(country = c("Argentina", "Australia", "Austria"),
                     region = c("Americas", "Asia", "Europe"),
                     start_date = c(18487, 18487, 18487))

# A tibble: 3 x 3
  country   region   start_date
  <chr>     <chr>         <dbl>
1 Argentina Americas      18487
2 Australia Asia          18487
3 Austria   Europe        18487

As you can see, the start_date values are stored as Unix epoch days (the number of days since 1970-01-01). I want to convert them to ordinary calendar dates. My actual dataset has many tables, each with many rows and columns that need this conversion.

So rather than running multiple long lines of code, I want to create my own function in R which does the same thing but in fewer characters. Usually, I would do something like this:

db_country <- db_country %>% mutate(start_date = as_date(start_date))

Since I want to make a shortcut function I tried the following but they gave me errors:

(I did load the tidyverse and lubridate packages)

mydate1 <- function(dataset, column) {
  dataset <- dataset %>% mutate(column = as_date(column))
}

mydate1(db_country, start_date)

# Error: Problem with `mutate()` input `column`.
# x error in evaluating the argument 'x' in selecting a method for function 'as_date':
#  object 'start_date' not found
# i Input `column` is `as_date(column)`.

mydate2 <- function(dataset, column) {
  dataset$column <- as_date(dataset, dataset$column)
}

mydate2(db_country, start_date)

# Error in as.Date.default(x, ...) : 
#  do not know how to convert 'x' to class “Date”

mydate3 <- function(dataset, column) {
  dataset$column <- as.Date.numeric(dataset, dataset$column)
}

mydate3(db_country, start_date)

# Error in as.Date(origin, ...) + x : 
#  non-numeric argument to binary operator
# In addition: Warning messages:
# 1: Unknown or uninitialised column: `column`. 
# 2: In as.Date.numeric(dataset, dataset$column) :
#   Incompatible methods ("+.Date", "Ops.data.frame") for "+"

I would really appreciate any help or advice with this :)

kiwi

1 Answer

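You have to use non-standard evaluation (NSE) when referring to column names inside a function. In your first attempt, column is not a column of the data, so R evaluates the argument you passed (start_date) as an ordinary variable and fails with "object 'start_date' not found".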

If you want to pass unquoted column names to the function, use {{ }}:

library(dplyr)
library(lubridate)
library(rlang)

mydate1 <- function(dataset, column) {
  dataset %>% mutate({{column}} := as_date({{column}}))
}

mydate1(db_country, start_date)
# A tibble: 3 x 3
#  country   region   start_date
#  <chr>     <chr>    <date>    
#1 Argentina Americas 2020-08-13
#2 Australia Asia     2020-08-13
#3 Austria   Europe   2020-08-13

If you want to pass quoted names (strings), change the function to:

mydate1 <- function(dataset, column) {
  dataset %>% mutate(!!column := as_date(.data[[column]]))
}

mydate1(db_country, 'start_date')
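Since the question mentions many columns that need converting, the quoted-name version can be extended to several columns at once. The following is only a sketch and not part of the original answer: the helper name mydate_multi and the end_date column are made up for illustration, and it assumes a dplyr version that provides across() (1.0.0 or later).

```r
library(dplyr)
library(lubridate)

# Hypothetical helper: converts every column named in `columns`
# (a character vector) from days-since-epoch to Date.
mydate_multi <- function(dataset, columns) {
  dataset %>% mutate(across(all_of(columns), as_date))
}

# Same sample data as in the question, plus an invented end_date column.
db_country <- tibble(country = c("Argentina", "Australia", "Austria"),
                     region = c("Americas", "Asia", "Europe"),
                     start_date = c(18487, 18487, 18487),
                     end_date = c(18500, 18500, 18500))

converted <- mydate_multi(db_country, c("start_date", "end_date"))
```

all_of() makes the selection strict, so a misspelled column name raises an error instead of being silently skipped.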
Ronak Shah
  • Thanks a lot for your answer :) Question: in the first function, why did you put a colon before the equals sign (like this `:=`)? – kiwi Oct 23 '20 at 11:26
  • Because here we want to evaluate the variable `column` rather than take it literally. For example, `db_country %>% mutate(column = 1:3)` creates a new column literally named `column`. In the function above we don't want a column named `column`; we want whatever name is stored in the `column` argument to be used as the column name, and `:=` allows that on the left-hand side. – Ronak Shah Oct 23 '20 at 11:34
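The difference described in the comment above can be seen in a small side-by-side sketch. The helper names add_literal and add_injected and the toy tibble df are hypothetical, chosen only for this illustration.

```r
library(dplyr)

df <- tibble(x = 1:3)

# Plain `=`: the left-hand side is taken literally,
# so a new column called "column" is added.
add_literal <- function(dataset, column) {
  dataset %>% mutate(column = 0)
}

# `{{ }}` with `:=`: the left-hand side becomes the column
# the caller supplied, so that column is overwritten.
add_injected <- function(dataset, column) {
  dataset %>% mutate({{ column }} := 0)
}

names(add_literal(df, x))   # "x" "column"
names(add_injected(df, x))  # "x"
```

R's parser does not accept {{ column }} on the left of a plain `=` in a function call, which is why `:=` is needed there.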