I am learning drake to define my analysis workflow, but I have trouble getting data files recognized as dependencies. I use the function file_in() inside drake_plan(), but it only works if I give the path to the file directly as a literal string. If I build the path with the file.path() function, or pass a variable storing that file path, it doesn't work.
Examples:
# preparation
library(drake)
path.data <- "data"
dir.create(path.data)
write.csv(iris, file.path(path.data, "iris.csv"))
Working plan:
# working plan
working_plan <- drake_plan(
  iris_data = read.csv(file_in("data/iris.csv")),
  strings_in_dots = "literals"
)
working_config <- make(working_plan)
vis_drake_graph(working_config)
This plan works fine, and the file data/iris.csv is considered a dependency.
Not working plan:
# not working
notworking_plan <- drake_plan(
  iris_data = read.csv(file_in(file.path(path.data, "iris.csv"))),
  strings_in_dots = "literals"
)
notworking_config <- make(notworking_plan)
vis_drake_graph(notworking_config)
Here it is trying to read the file iris.csv instead of data/iris.csv.
Working but problem with dependency:
# working but "data/iris.csv" is not considered as a dependency
file.name <- file.path(path.data, "iris.csv")
notworking_plan <- drake_plan(
  iris_data = read.csv(file_in(file.name)),
  strings_in_dots = "literals"
)
notworking_config <- make(notworking_plan)
vis_drake_graph(notworking_config)
This last one runs fine, but the file is not considered a dependency, so drake doesn't re-run the plan if the file changes.
So, is there a way to tell drake about file dependencies whose paths are stored in variables?