
Thanks to DAVID SCHOCH, I adapted his function with slight modifications to create the following, and it works fine.

Now I need to compute these variables, i.e. run the functions on the graph object H, in parallel, perhaps using purrr or furrr, so that it runs faster. My real data is large, with 30 more functions. The output should be the same as H_indices in the last line of the code below.

library(igraph); library(sna);  library(centiserve); library(tidygraph); library(tibble); library(expm)

H <- play_islands(5, 10, 0.8, 3)

all_indices <- function(g) {
  tibble(
    degree = igraph::degree(g),
    flowbet = sna::flowbet(get.adjacency(g, sparse = FALSE)),
    communibet = centiserve::communibet(g)
  )
}

H_indices <- all_indices(H)
alistaire
Geet
  • purrr is for iterating, thus so is furrr (but in parallel). Your code doesn't have any exposed iteration, so furrr is not a simple way to parallelize here. I guess you could iterate across the functions with `furrr::future_invoke_map`, if that's what you're implying? – alistaire Feb 16 '19 at 20:44
  • Is there a way to iterate over the 3 variables/functions in the tibble above and then apply furrr? – Geet Feb 16 '19 at 20:48
  • No, iterating in parallel is what furrr does, and where you'd get any speedup. My last point was that you could do `furrr::future_invoke_map(list(igraph::degree, function(x) sna::flowbet(get.adjacency(x, sparse = FALSE)), centiserve::communibet), g)` (or something similar). If one function takes a long time and the rest are negligible, you won't see much benefit, though; it's not a natural case for parallelization. – alistaire Feb 16 '19 at 20:53
  • I am getting an error 'Error: `.x` (3) and `.y` (10) are different lengths'. Can you please check the invoke_map function syntax? And, can I use that to get the dataframe output just like H_indices? – Geet Feb 16 '19 at 21:00
  • Yeah, you probably need `list(g)` instead of `g` so it doesn't try to iterate over it. If the outputs are all named vectors of the same length or data frames, you could use `furrr::future_invoke_map_dfc` or manually call `dplyr::bind_cols`, but I suspect you'll need to do some restructuring first unless you plan really carefully. – alistaire Feb 16 '19 at 21:23
  • It worked on a few package functions but not on others. For example, despite installing and loading the expm package, it still throws an error: "Error: The 'expm' package needed for this function to work. Please install it." Does the package need to be loaded on the individual nodes/clusters or something? If so, do you know how? – Geet Feb 16 '19 at 21:31
  • You'll have to dig through the furrr/future docs. I believe you'd need to pass it through to `?future::Future` when you call `plan`. – alistaire Feb 16 '19 at 21:47
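
The idea discussed in the comments above, iterating over the functions rather than over the graph, can be sketched roughly as follows. This is only an illustration under stated assumptions: it uses `future_map()` over a named list of functions instead of `future_invoke_map()`, it uses `igraph::sample_islands()` and three igraph centrality functions as stand-ins for H and the real indices, and `index_fns`, `cols`, and `g_indices` are hypothetical names. The `packages` argument of `furrr_options()` asks each worker to attach the listed packages; whether that resolves the expm error mentioned above has not been verified here.

```r
library(furrr)    # also attaches future, which provides plan()
library(igraph)
library(tibble)

# Two background R sessions; adjust `workers` to the machine.
plan(multisession, workers = 2)

# Stand-in graph (assumption): 3 islands of 10 vertices each.
g <- sample_islands(3, 10, 0.8, 3)

# Named list of one-argument functions; each returns one value per vertex.
# These are placeholders for the real index functions.
index_fns <- list(
  degree      = function(g) igraph::degree(g),
  closeness   = function(g) igraph::closeness(g),
  betweenness = function(g) igraph::betweenness(g)
)

# Iterate over the functions in parallel; every call receives the same graph.
cols <- future_map(
  index_fns,
  function(f) f(g),
  .options = furrr_options(packages = "igraph", seed = TRUE)
)

# Equal-length vectors bind into one tibble, one column per function.
g_indices <- as_tibble(cols)

plan(sequential)  # shut the workers down again
```

As noted in the comments, this only pays off when several of the functions are individually slow; if one function dominates the runtime, the total time stays close to that of the slowest call.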

1 Answer


Without a reprex it's hard to give a definite solution, so the following is only meant to convey the idea. It should work if H is a list or vector of graphs, since future_map() iterates over the elements of H.

library(furrr)
library(dplyr)

plan(multisession)  # `multiprocess` is deprecated in recent versions of future

H %>% 
  future_map(all_indices,
             .options = furrr_options(seed = TRUE))
Agile Bean
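
For the case the answer describes, where H is a list of graphs, a minimal sketch might look like the following. It assumes two stand-in graphs generated with `igraph::sample_islands()` and a cut-down `all_indices()` that uses only igraph functions; `graphs` and `indices_by_graph` are hypothetical names that do not appear in the original post.

```r
library(furrr)    # also attaches future, which provides plan()
library(igraph)
library(dplyr)

plan(multisession, workers = 2)

# Stand-in list of graphs (assumption); the real H would go here.
graphs <- list(
  small = sample_islands(2, 10, 0.8, 3),
  large = sample_islands(4, 10, 0.8, 3)
)

# Cut-down version of all_indices(); functions are namespaced explicitly
# so the workers can find them without extra setup.
all_indices <- function(g) {
  tibble::tibble(
    degree      = igraph::degree(g),
    betweenness = igraph::betweenness(g)
  )
}

# One parallel task per graph, then stack the per-graph tibbles,
# recording the list names in a `graph` column.
indices_by_graph <- graphs %>%
  future_map(all_indices, .options = furrr_options(seed = TRUE)) %>%
  bind_rows(.id = "graph")

plan(sequential)  # shut the workers down again
```

With a single graph there is only one element to iterate over, so this form brings no speedup; it helps when many graphs need the same set of indices.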