This might be a naive question, but I have searched multiple resources on multiprocessing and ipyparallel, and they seem to lack the information I need for my task.
I have a large directed graph G with 9 million edges and 6 million nodes. My goal is to extract subgraphs from G for a list of 50k target nodes, each taken along with its direct neighbours (both in and out). I am currently using networkx to do this.
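To make the task concrete, here is a minimal serial sketch. It assumes one subgraph per target node, induced on that node plus its predecessors and successors. The helper name neighbourhood_subgraph, the targets list, and the toy graph are only illustrative stand-ins, not the real data.

```python
import networkx as nx


def neighbourhood_subgraph(G, node):
    """Return the subgraph of G induced by `node` and its direct in/out neighbours."""
    keep = {node}
    keep.update(G.predecessors(node))  # nodes with an edge pointing into `node`
    keep.update(G.successors(node))    # nodes that `node` points to
    # G.subgraph() returns a read-only view; .copy() makes an independent graph.
    return G.subgraph(keep).copy()


# Tiny stand-in for the real graph (assumption: any DiGraph works the same way).
G = nx.DiGraph([(1, 2), (3, 1), (2, 4), (5, 6)])
targets = [1, 5]

# One subgraph per target node, computed one after another.
subgraphs = {t: neighbourhood_subgraph(G, t) for t in targets}
```

Each call is independent of the others, which is why the loop over targets looks like a natural candidate for parallelization.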
I tried to use ipyparallel, but I could not find a tutorial on how to share an object (in my case, G) across processors for the subgraph function. Is there an easy way to parallelize this across CPU cores? There are 56 available, so I really want to make full use of them.
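For illustration, the kind of pattern I have in mind looks roughly like the sketch below, using the standard-library multiprocessing module rather than ipyparallel. It assumes a Unix system where the "fork" start method is available, so that worker processes inherit G from the parent instead of receiving a pickled copy. The names SHARED_GRAPH, set_shared_graph, neighbourhood_edges, and extract_parallel are hypothetical, and nothing here has been measured on a graph of the real size.

```python
import multiprocessing as mp

import networkx as nx

# Module-level handle to the big graph. With the "fork" start method, workers
# created after this is set see the same object without it being pickled.
SHARED_GRAPH = None


def set_shared_graph(G):
    """Store G in the module-level global read by the worker function."""
    global SHARED_GRAPH
    SHARED_GRAPH = G


def neighbourhood_edges(node):
    """Worker: return (node, nodes, edges) for `node` and its in/out neighbours."""
    keep = {node}
    keep.update(SHARED_GRAPH.predecessors(node))
    keep.update(SHARED_GRAPH.successors(node))
    # Send back plain lists rather than graph objects to keep pickling cheap.
    # Node and edge attributes are dropped in this sketch.
    return node, sorted(keep), list(SHARED_GRAPH.subgraph(keep).edges())


def extract_parallel(G, targets, processes=4, chunksize=256):
    """Build one small DiGraph per target node using a pool of forked workers."""
    set_shared_graph(G)           # must happen before the pool is created
    ctx = mp.get_context("fork")  # assumption: Unix only, not available on Windows
    with ctx.Pool(processes=processes) as pool:
        results = pool.map(neighbourhood_edges, targets, chunksize=chunksize)

    subgraphs = {}
    for node, nodes, edges in results:
        sg = nx.DiGraph()
        sg.add_nodes_from(nodes)  # keeps targets that have no neighbours at all
        sg.add_edges_from(edges)
        subgraphs[node] = sg
    return subgraphs
```

On the real data this would be called as something like extract_parallel(G, targets, processes=56). Only the node labels in targets and the small result lists cross process boundaries, but I do not know whether this is the recommended way to share G, or how the same thing would be done in ipyparallel.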
Thank you!