Questions tagged [drake-r-package]

The drake R package is a Make-like pipeline toolkit. Its purpose is to enhance reproducibility, automation, speed, and scale in R-focused data science workflows. Use this tag for general questions about usage or for help optimizing and debugging drake-powered projects. For bug reports and feature requests, please post to the GitHub issue tracker.

85 questions
1 vote, 0 answers

Is there a simple way to force targets to be up to date?

I was wondering if there is a way to force targets to be up to date using drake. I wrote some new functions and now suddenly all my targets are outdated. I have no idea why this happened. Now, I am very sure that I didn't change the previous…
1 vote, 1 answer

"Missing files for target" error using Drake for Rmd

I've been learning how to use drake today and managed to migrate my code, but not my R Markdown reports yet. This report compiles fine and produces the expected output, but it also gives this error, which no amount of searching has shed light on. I am using…
Syzorr
1 vote, 1 answer

bind_plans and map transformation

Is it possible to use a map transformation with a grouping variable that is described in an external plan? In other words, this works for me: plan_a = drake_plan( foo = target(x + 1, transform = map(x = c(4, 5, 6))), bar = target(y + 5,…
lordbitin
1 vote, 1 answer

Avoiding saving cache for a target in R package drake

I've seen that by default the R package drake saves all cache for each target. Sometimes, a target is just selecting some columns from the previous target but if the data is really big, this means that you get two saved targets which are really big.…
cimentadaj
1 vote, 1 answer

Can rmarkdown return a value to a target

I find myself using rmarkdown/rnotebooks quite a bit to do exploratory analysis since I can combine code, prose and graphs. Many times, I'll write my entire predictive modeling approach and the model itself within markdown. However, then I end up…
Rahul
1 vote, 1 answer

How to combine multiple drake targets into a single cross target without combining the datasets?

Drake rocks! I have a complex multistage processing problem. The problem can be illustrated with this example. I have 2 processes at level 1, and I want all the datasets generated by all the level 1 processes to be processed by a single target at…
Dennis
1 vote, 1 answer

Working with multiple files across multiple plans in drake

I am attempting to use the drake R package to process multiple file inputs across multiple plans, so I can build up my targets iteratively, testing what works at each stage. Below is a trivial reprex showing what I am trying to accomplish. The…
rmflight
1 vote, 2 answers

How can I get the size that a drake target takes on disk?

When I need to understand my drake plan, vis_drake_graph() comes in handy, and it displays the time that each target took to run. This is very helpful in figuring out whether targets should be broken down to reduce re-run time on small changes. My…
Magnus
1 vote, 1 answer

Exclude functions imported from packages from drake graph visualization?

I'd like to exclude nodes representing functions imported from external packages (e.g. stringr::str_sub) from the graph visualization of my Drake plan, but keep the nodes for functions that come from scripts I've sourced into the environment. How…
rushgeo
1 vote, 1 answer

drake readd function not working for plots

I'm trying to troubleshoot why drake plots are not showing up with readd(); the rest of the pipeline seems to have worked, though. Not sure if this is caused by minfi::densityPlot or some other reason; my thoughts are the latter as it's also not…
1 vote, 1 answer

Manually add dependency in drake workflow?

Let's say I have a drake plan where I create a SQL table in an external database, and after that job, I download from some table that depends on the initial job. My plan might look like this drake_plan(up_job = create_sql_file('some_input.csv'), …
pedram
1 vote, 1 answer

Is there a way of "chunking" drake outputs to speed up plan verification and display?

I'm conducting simulations over a range of models and parameter values. At this point in time my drake workflow involves over 3k simulated data.frames and corresponding stanfit objects. Trying to run make currently incurs a delay of ~2…
overdisperse
1 vote, 1 answer

Halting drake plan makes it rebuild targets it already had built previously

I'm currently using drake to run a set of >1k simulations. I've estimated that it would take about two days to run the complete set, but I also expect my computer to crash at any point during that period because, well, it has. Apparently stopping…
overdisperse
1 vote, 1 answer

Trigger notification from report generation in R drake package

I've set up a drake pipeline that generates a report at the end of the pipeline. I would like to trigger a slack notification every time a new report is created. For the report part of my plan I use the following: report_plan <- drake::drake_plan( …
Jenna Allen
1 vote, 2 answers

Give a character string to define a file dependency in drake

I am learning drake to define my analysis workflow but I have trouble getting data files as dependencies. I use the function file_in() inside drake_plan() but it only works if I give the path to the file directly. If I give it with the file.path()…
norival