I am writing a function that goes through a data frame and spreads a factor
variable into new dummy variables, since some machine learning algorithms cannot handle factors. To do that, I use the spread()
function inside the cleaning function.
However, when I try to pass the name of the column I need to spread, it throws an error:
Error: Invalid column specification
Here is the code:
library(tidyr)
library(dplyr)
library(C50) # this is one source for the churn data
data(churn)
f <- function(df, name) {
  df$dummy <- c(1:nrow(df)) # create dummy variable with unique values
  df <- spread(df, key = as.character(substitute(name)), "dummy", fill = 0)
}
churnTrain = f(churnTrain, name = "state")
str(churnTrain)
Of course, if I replace key = as.character(substitute(name))
with key = "state"
it works just fine, but the whole function loses its reusability.
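To make the working case concrete, here is a minimal sketch of the hard-coded variant on a toy data frame. The data frame `df`, its values, and the result name `wide` are made up for illustration and only stand in for churnTrain:

```r
library(tidyr)

# Toy stand-in for churnTrain (hypothetical values, not the real churn data)
df <- data.frame(state = factor(c("KS", "OH", "NJ")), x = c(1, 2, 3))
df$dummy <- seq_len(nrow(df)) # unique value per row, as in f()

# Hard-coded key: this runs, but it only ever spreads the "state" column
wide <- spread(df, key = "state", value = "dummy", fill = 0)
str(wide)
```

This is the behaviour I would like to get from f() for any column name, not just "state".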
How can I pass a column name to the inner function without getting this error?