I would like to monitor the progress of an mapply call. My data consists of two lists, and the function takes two arguments.

If I do something similar with a function that takes one argument, I can use ldply instead of lapply. (I'd like to rbind.fill the output into a data.frame.)

If I try the same with mdply, it doesn't work, because mdply expects the function's arguments to come from the columns of a data frame or array, whereas mapply takes lists as input.

These plyr apply functions are handy, not just because I can get the output as a data.frame but also because I can use the progress bar.

I know there is the pbapply package, but it has no mapply version. There is also the txtProgressBar function, but I could not figure out how to use it with mapply.

I tried to create a reproducible example (it takes around 30 s to run).

It is probably a bad example, though: in reality, l1 is a list of scraped web pages (rvest::read_html), which I cannot pass to mdply as a data frame. The lists really need to stay lists.

mdply <- plyr::mdply

l1 <- as.list(rep("a", 2*10^6+1))
l2 <- as.list(rnorm(2*10^6+1)) # same length as l1

my_func <- function(x, y) {
    ab <- paste(x, "b", sep = "_")
    # note: paste0() has no sep argument, so "__" is appended as a trailing string
    ab2 <- paste0(ab, exp(y), sep = "__")
    return(ab2)
}

mapply(my_func, x = l1, y = l2)

mdply doesn't work:

mdply(l1, l2, my_func, .progress='text')

Error in do.call(flat, c(args, list(...))) : 'what' must be a function or character string
Roccer

2 Answers

Judging from ?mdply, you cannot pass two separate data inputs. The error message means that mdply is trying to use l2 as the function (the second argument of mdply is .fun), and a list cannot be coerced to a function.

The following works fine:

mdply(
    data.frame(x=unlist(l1), y=unlist(l2)), # create a data.frame from l1 and l2
    my_func, # your function
    .progress=plyr::progress_text(style = 3) # create a textual progress bar
)[, 3] # keep the output only

I think I've understood your purpose now:

mdply(
    .data=data.frame(r=seq_along(l1)), # "fake data" (used only as item indices)
    .fun=function(r) return(my_func(l1[[r]], l2[[r]])), # a wrapper function of your function
    .progress=plyr::progress_text(style = 3) # create a textual progress bar
)[, 2] # keep the output only

Please note that I had to wrap your function in a new one that takes a single argument and uses it as an index into l1 and l2.
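The same index-wrapper idea can be sketched with plyr::ldply, which iterates over the indices directly and row-binds the results, so l1 and l2 can stay ordinary lists. The toy inputs and the data-frame-returning my_func below are assumptions made for illustration, not the data or function from the question.

```r
library(plyr)

# Toy stand-ins for the real inputs (hypothetical, much smaller than in the question)
l1 <- as.list(rep("a", 5))
l2 <- as.list(seq(0, 1, length.out = 5))

# Hypothetical variant of my_func that returns a one-row data frame
my_func <- function(x, y) {
    data.frame(label = paste(x, "b", sep = "_"), value = exp(y))
}

# Iterate over the indices; the wrapper looks up the i-th element of each list
res <- ldply(
    seq_along(l1),
    function(i) my_func(l1[[i]], l2[[i]]),
    .progress = "text" # textual progress bar
)
```

Because only the index is handed to plyr, the list elements can be arbitrary objects (for example parsed web pages) as long as my_func returns something ldply can bind into rows.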

Bruno Zamengo
  • Thanks. The problem is that l1 is in reality a list of web pages that I scraped using rvest::read_html. I cannot use this list as a column in a data.frame. I guess the example was a bad one. – Roccer Aug 14 '17 at 13:04
  • Thanks for your help. The function runs, but the output is not what I want (it differs from what I get from mapply). I will accept your answer later, as you solved my example. – Roccer Aug 14 '17 at 13:31

Answering my own question. There is now a function called pbmapply in pbapply that adds progress bars to mapply.
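A minimal sketch of how that might look, assuming a version of pbapply that includes pbmapply is installed. The small toy lists and the simplified my_func are illustrative assumptions, not the original data.

```r
library(pbapply)

# Toy stand-ins for the real inputs (hypothetical, much smaller than in the question)
l1 <- as.list(rep("a", 5))
l2 <- as.list(seq(0, 1, length.out = 5))

# Simplified version of the function from the question
my_func <- function(x, y) {
    paste0(paste(x, "b", sep = "_"), "__", exp(y))
}

# Drop-in replacement for mapply(); the inputs stay ordinary lists
res <- pbmapply(my_func, x = l1, y = l2)
```

The call takes the same arguments as mapply, so existing code should only need the function name changed.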

Roccer