This might be a naive question, but I have searched multiple resources on multiprocessing and ipyparallel, and they seem to lack the information I need for my task.

I have a large directed graph G with 9 million edges and 6 million nodes. My goal is to extract subgraphs from G for a list of target nodes (50k), along with their direct neighbours (both in and out). I am currently using networkx to do this.

I tried ipyparallel, but I could not find a tutorial on how to share an object (in my case, G) across processors for the subgraph function. Is there an easy way to parallelize this across different CPU cores? There are 56 available, so I really want to make full use of them.

Thank you!

Zhiya

1 Answer

Try treating G as a database: instead of sharing it across all the subprocesses, let each subprocess query it for the information it needs and do its own work.
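One way this could look, as a rough sketch rather than a tested solution for a graph of this size: dump the edge list into an SQLite file once, then let each worker process open its own connection and query only the neighbourhoods it needs. Everything here is an assumption for illustration: the helper names (`build_edge_db`, `ego_edges`, `extract_parallel`), the table layout, integer node IDs, and the reading of the question as "one induced subgraph per target node".

```python
import multiprocessing as mp
import sqlite3

_conn = None  # per-process connection, set by _init_worker


def build_edge_db(db_path, edges):
    """Write (src, dst) pairs to an indexed SQLite table (hypothetical layout)."""
    conn = sqlite3.connect(db_path)
    with conn:  # commits on success
        conn.execute("CREATE TABLE IF NOT EXISTS edges (src INTEGER, dst INTEGER)")
        conn.executemany("INSERT INTO edges VALUES (?, ?)", edges)
        conn.execute("CREATE INDEX IF NOT EXISTS idx_src ON edges (src)")
        conn.execute("CREATE INDEX IF NOT EXISTS idx_dst ON edges (dst)")
    conn.close()


def ego_edges(conn, node):
    """Edges of the subgraph induced by `node` and its in/out neighbours."""
    nodes = {node}
    for (n,) in conn.execute("SELECT dst FROM edges WHERE src = ?", (node,)):
        nodes.add(n)  # successors
    for (n,) in conn.execute("SELECT src FROM edges WHERE dst = ?", (node,)):
        nodes.add(n)  # predecessors
    result = []
    for u in nodes:
        for (v,) in conn.execute("SELECT dst FROM edges WHERE src = ?", (u,)):
            if v in nodes:  # keep only edges with both endpoints in the set
                result.append((u, v))
    return sorted(result)


def _init_worker(db_path):
    # Each worker opens its own connection; nothing is shared between processes.
    global _conn
    _conn = sqlite3.connect(db_path)


def _worker(node):
    return node, ego_edges(_conn, node)


def extract_parallel(db_path, targets, processes=4):
    """Map each target node to the edge list of its neighbourhood subgraph."""
    with mp.Pool(processes, initializer=_init_worker, initargs=(db_path,)) as pool:
        return dict(pool.imap_unordered(_worker, targets, chunksize=256))
```

With networkx, the database could be filled once via `build_edge_db("graph.db", G.edges())` (assuming integer node labels), and each returned edge list can be turned back into a graph with `nx.DiGraph(edge_list)`. Call `extract_parallel` from under an `if __name__ == "__main__":` guard. Isolated target nodes come back with an empty edge list, so they would need to be added as nodes separately.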

alikyos