For example:
require(RevoScaleR)
# Create a data frame
set.seed(100)
myData = data.frame(x = 1:100, y = rep(c("a", "b", "c", "d"), 25),
z = rnorm(100), w = runif(100))
# Create a multi-block .xdf file from the data frame
inputFile = file.path(tempdir(), "testInput.xdf")
rxDataStep(inData = myData, outFile = inputFile, rowsPerRead = 50,
overwrite = TRUE)
# Square the values in the column "z"; this works fine
rxDataStep(inData = inputFile, outFile = inputFile, overwrite = TRUE,
transforms = list(z = z^2))
# Define a squaring function and try to use it to repeat the previous step:
myFun = function(x) x^2
rxDataStep(inData = inputFile, outFile = inputFile, overwrite = TRUE,
transforms = list(z = myFun(z)))
The final step fails with the error:
Error in transformation function: Error in eval(expr, envir, enclos) : could not find function "myFun"
The documentation for rxDataStep states that "As with all expressions, transforms ... can be defined outside of the function call using the expression function." However, I don't see how to apply this advice, and I can't find an example. For instance, the following does not work:
myFun = expression(function(x) x^2)
rxDataStep(inData = inputFile, outFile = inputFile, overwrite = TRUE,
transforms = list(z = myFun(z)))
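A base-R check that does not need RevoScaleR may help narrow down why this last attempt fails. It is only a sketch: the names `wrapped`, `sq`, `myTransforms`, and `chunk` are made up for illustration, and the two commented-out `rxDataStep` calls at the end are untested assumptions about how the documentation's advice and the `transformObjects` argument are meant to be used.

```r
# expression() stores unevaluated code; it does not create a function.
wrapped <- expression(function(x) x^2)
is.function(wrapped)    # FALSE: this is an expression vector, so wrapped(z) cannot be called
is.expression(wrapped)  # TRUE

# Evaluating the stored code is what actually produces a callable function.
sq <- eval(wrapped[[1]])
is.function(sq)         # TRUE

# The documentation's wording seems to refer to storing the whole transforms
# list as an expression, not to wrapping a helper function in expression().
myTransforms <- expression(list(z = z^2))

# Mimic evaluating the transforms against one chunk of data (a plain data frame).
chunk <- data.frame(z = c(1, 2, 3))
result <- eval(myTransforms[[1]], envir = chunk)
print(result$z)

# Assumed usage with RevoScaleR (not run here):
# rxDataStep(inData = inputFile, outFile = inputFile, overwrite = TRUE,
#            transforms = myTransforms)
# To make a helper visible inside transforms, the transformObjects argument
# looks like the intended route:
# rxDataStep(inData = inputFile, outFile = inputFile, overwrite = TRUE,
#            transforms = list(z = myFun(z)),
#            transformObjects = list(myFun = myFun))
```

If that reading is right, the `could not find function "myFun"` error would come from the transforms being evaluated in an environment where objects from the calling workspace are not visible unless they are passed in explicitly.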