
I've been experimenting with readr's read_delim_chunked family of functions. The documentation doesn't make clear whether, or how, extra arguments can be passed to the callback function. For instance, take this example from the documentation:

# Cars with 3 gears
f <- function(x, pos) {
  dplyr::filter(x, .data[["gear"]] == 3)
}

readr::read_csv_chunked(
  readr::readr_example("mtcars.csv"), 
  readr::DataFrameCallback$new(f), 
  chunk_size = 5)

# A tibble: 15 x 11
    mpg   cyl  disp    hp  drat    wt  qsec    vs    am  gear  carb
   <dbl> <int> <int> <int> <dbl> <dbl> <dbl> <int> <int> <int> <int>
 1  21.4     6   258   110  3.08 3.215 19.44     1     0     3     1
 2  18.7     8   360   175  3.15 3.440 17.02     0     0     3     2
 3  18.1     6   225   105  2.76 3.460 20.22     1     0     3     1

This works fine. But what if I wanted to parameterize the gear value? For instance,

f <- function(x, pos, gear_val) {
  dplyr::filter(x, .data[["gear"]] == gear_val)
}

readr::read_csv_chunked(
  readr::readr_example("mtcars.csv"),
  readr::DataFrameCallback$new(f, gear_val = 3),
  chunk_size = 5
)

Error in .subset2(public_bind_env, "initialize")(...) :
  unused argument (gear_val = 3)

I've tried several ways of passing a parameter through to the callback function, but none of them work. Does anyone know how to do this?

pogibas
TinyHeero

1 Answer


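Use a function factory (a function that returns a function) in this case. DataFrameCallback$new() accepts only the callback itself, so the extra value has to be captured in the enclosing environment of the (x, pos) function, e.g.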

f <- function(gear_val) {
  function(x, pos) {
    dplyr::filter(x, .data[["gear"]] == gear_val)
  }
}

readr::read_csv_chunked(
  readr::readr_example("mtcars.csv"),
  readr::DataFrameCallback$new(f(gear_val = 3)),
  chunk_size = 5
)
#> Parsed with column specification:
#> cols(
#>   mpg = col_double(),
#>   cyl = col_double(),
#>   disp = col_double(),
#>   hp = col_double(),
#>   drat = col_double(),
#>   wt = col_double(),
#>   qsec = col_double(),
#>   vs = col_double(),
#>   am = col_double(),
#>   gear = col_double(),
#>   carb = col_double()
#> )
#> # A tibble: 15 x 11
#>      mpg   cyl  disp    hp  drat    wt  qsec    vs    am  gear  carb
#>    <dbl> <dbl> <dbl> <dbl> <dbl> <dbl> <dbl> <dbl> <dbl> <dbl> <dbl>
#>  1  21.4    6.  258.  110.  3.08  3.22  19.4    1.    0.    3.    1.
#>  2  18.7    8.  360.  175.  3.15  3.44  17.0    0.    0.    3.    2.
#>  3  18.1    6.  225.  105.  2.76  3.46  20.2    1.    0.    3.    1.
#>  4  14.3    8.  360.  245.  3.21  3.57  15.8    0.    0.    3.    4.
#>  5  16.4    8.  276.  180.  3.07  4.07  17.4    0.    0.    3.    3.
#>  6  17.3    8.  276.  180.  3.07  3.73  17.6    0.    0.    3.    3.
#>  7  15.2    8.  276.  180.  3.07  3.78  18.0    0.    0.    3.    3.
#>  8  10.4    8.  472.  205.  2.93  5.25  18.0    0.    0.    3.    4.
#>  9  10.4    8.  460.  215.  3.00  5.42  17.8    0.    0.    3.    4.
#> 10  14.7    8.  440.  230.  3.23  5.34  17.4    0.    0.    3.    4.
#> 11  21.5    4.  120.   97.  3.70  2.46  20.0    1.    0.    3.    1.
#> 12  15.5    8.  318.  150.  2.76  3.52  16.9    0.    0.    3.    2.
#> 13  15.2    8.  304.  150.  3.15  3.44  17.3    0.    0.    3.    2.
#> 14  13.3    8.  350.  245.  3.73  3.84  15.4    0.    0.    3.    4.
#> 15  19.2    8.  400.  175.  3.08  3.84  17.0    0.    0.    3.    2.

Created on 2018-03-12 by the reprex package (v0.2.0).
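
If you would rather keep the three-argument f from the question unchanged, a thin wrapper should work as well. The following is only a sketch under that assumption: make_gear_callback and the gears vector are hypothetical names introduced for illustration, not part of readr. It builds one callback per gear value and calls force() so that each value is fixed when the callback is created rather than when the first chunk is read.

```r
# Three-argument filter from the question; gear_val is the extra parameter.
f <- function(x, pos, gear_val) {
  dplyr::filter(x, .data[["gear"]] == gear_val)
}

# Hypothetical helper: adapts f to the two-argument (x, pos) signature
# that the callback is expected to have.
make_gear_callback <- function(gear_val) {
  force(gear_val)  # evaluate the argument now instead of lazily later
  function(x, pos) f(x, pos, gear_val = gear_val)
}

# Illustrative set of values; one chunked read per gear value.
gears <- c(3, 4, 5)
results <- lapply(gears, function(g) {
  readr::read_csv_chunked(
    readr::readr_example("mtcars.csv"),
    readr::DataFrameCallback$new(make_gear_callback(g)),
    chunk_size = 5
  )
})
names(results) <- gears
```

For a single value, an anonymous function does the same job without a helper: readr::DataFrameCallback$new(function(x, pos) f(x, pos, gear_val = 3)).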

Jim