Graph Data Science library: stream is too slow, and how do I retrieve node labels (types)?

Question

Q1. We're trying to perform a random walk, and I followed the example.

Our graph consists of 170 million nodes and 1,700 million edges, and we allocated plenty of memory for both the heap (256 GB) and the page cache (10 GB).

We set the sampling ratio so as to sample 10,000 nodes. The random walk itself takes a short time, but retrieving the graph into a Python DataFrame via stream takes forever.
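
For reference, here is a minimal sketch of how the ratio works out from the numbers above, assuming the sampling ratio is simply the fraction of nodes to keep (target nodes divided by total nodes). The helper name `sampling_ratio` is hypothetical and not part of GDS.

```python
# Figures taken from this question; the helper below is illustrative only
# and is not a GDS function.
TOTAL_NODES = 170_000_000
TARGET_NODES = 10_000


def sampling_ratio(target_nodes: int, total_nodes: int) -> float:
    """Fraction of nodes to keep, assuming ratio = target / total."""
    if total_nodes <= 0:
        raise ValueError("total_nodes must be positive")
    if not 0 < target_nodes <= total_nodes:
        raise ValueError("target_nodes must be between 1 and total_nodes")
    return target_nodes / total_nodes


ratio = sampling_ratio(TARGET_NODES, TOTAL_NODES)
print(f"sampling ratio = {ratio:.2e}")
```

So the ratio we pass is very small, on the order of 1 in 17,000 nodes.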

Is there something I'm missing here, e.g. indexing?

my understanding is, gds basically

I don't think indexing is necessary in this case, though I'm not a database expert.

Q2.

We're trying to stream node labels from a graph in the graph catalog, but I can't find any function for that.

How should I fetch the node labels (types)?

For Q1: after a long wait, we retrieved the graph with the desired node count.

I think the stream took a long time because the graph has 40 million edges. Is there some way to limit the edges as well?
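
To put the 40 million edges in perspective, here is a rough back-of-the-envelope comparison using only the numbers in this post, and assuming the sample really contains about 10,000 nodes. The variable names are illustrative only.

```python
# Numbers from this question; nothing here calls GDS.
full_nodes, full_edges = 170_000_000, 1_700_000_000
sample_nodes, sample_edges = 10_000, 40_000_000

# Average number of edges per node in each graph.
full_avg = full_edges / full_nodes
sample_avg = sample_edges / sample_nodes

print(f"full graph:  {full_avg:.0f} edges per node")
print(f"sample:      {sample_avg:.0f} edges per node")
print(f"sample is {sample_avg / full_avg:.0f}x denser on average")
```

If those figures are right, the sample is far denser per node than the full graph, which would be consistent with the stream being dominated by edges rather than nodes.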

I see that there used to be walkLength and walksPerNode parameters.

Is there an equivalent for gds.alpha.graph.sample.rwr?
