I want to import data of a similar category from multiple source files.

Every source has a short label.

How can I incorporate this into drake, without writing out every file as its own target?

I thought the following would work, but it does not. Ideally, I would like to have the targets raw_a and raw_b.

input_files <- list(
  'a' = 'file_1.csv',
  'b' = 'file_2.csv'
)

plan <-
  drake::drake_plan(
    raw = drake::target(
      import_file(file),
      transform = map(
        file = file_in(!! input_files)
      )
    )
  )

with

import_file <- function(file) {
  readr::read_csv(file, skip = 2)
}
robust
  • Update: you may be interested in dynamic files: https://github.com/ropensci/drake/pull/1178. Brand new in development `drake` (the GitHub version, `remotes::install_github("ropensci/drake")`). – landau Feb 22 '20 at 13:33

3 Answers

3

You are so close. file_in() needs to go literally in the command, not the transformation.

library(drake)
input_files <- c("file_1.csv", "file_2.csv")

plan <- drake_plan(
  raw = target(
    import_file(file_in(file)),
    transform = map(file = !!input_files)
  )
)

config <- drake_config(plan)
vis_drake_graph(config)

Created on 2019-10-19 by the reprex package (v0.3.0)

landau
  • Thank you! Any suggestion on how to make target name suffixes from the list names? Would something like `label = names(input_files), .id = label` be the intended way? – robust Oct 20 '19 at 08:23
  • Yes, that should do it. – landau Oct 20 '19 at 12:38
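
A minimal sketch of the suggestion in the comments above, assuming the named `input_files` list and the `import_file()` helper from the question. It combines the answer's placement of `file_in()` in the command with a `label` grouping variable taken from the list names. The exact target names are not verified here; per the exchange above, `.id = label` should suffix them with the labels.

```r
library(drake)

# Named inputs from the question: the names are the short labels.
input_files <- list(a = "file_1.csv", b = "file_2.csv")

# Helper from the question; only called when the plan is actually built.
import_file <- function(file) {
  readr::read_csv(file, skip = 2)
}

plan <- drake_plan(
  raw = target(
    import_file(file_in(file)),  # file_in() stays literally in the command
    transform = map(
      file  = !!unlist(input_files, use.names = FALSE),  # file paths
      label = !!names(input_files),                      # short labels
      .id   = label  # build target names from label only, not from file
    )
  )
)

print(plan)
```

Printing the plan before calling `make(plan)` is a cheap way to check that there is one target per file and that each command contains `file_in()` around a literal path.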
1

This is probably the idiomatic solution.

plan <-
  drake::drake_plan(
    raw = drake::target(
      # file_in() must appear in the command itself so drake tracks the file
      import_file(file_in(file)),
      transform = map(
        file = c('file_1.csv', 'file_2.csv'),
        label = c('a', 'b'),
        .id = label
      )
    )
  )
robust
0

file_in() needs to wrap each file path string individually.

plan <-
  drake::drake_plan(
    raw = drake::target(
      import_file(file),
      transform = map(
        file = list(
          file_in('file_1.csv'),
          file_in('file_2.csv')
        )
      )
    )
  )