
I'm trying to migrate and upgrade my graph to the latest version of Neo4j and make use of new features and GDS algorithms. The old LPA community detection query was as follows:

CALL algo.labelPropagation.stream(
'MATCH (p:Publication) RETURN id(p) as id',

'MATCH (p1:Publication)-[r1:HAS_WORD]->(w)<-[r2:HAS_WORD]-(p2:Publication)
WHERE r1.occurrence > 5 AND r2.occurrence > 5
RETURN id(p1) as source, id(p2) as target, count(w) as weight',

{graph:'cypher',write:false, weightProperty : "weight"})

yield nodeId, label

WITH
label, collect(algo.asNode(nodeId)) as nodes where size(nodes) > 2
MERGE (c:PublicationLPACommunity {id : label})
FOREACH (n in nodes |
 MERGE (n)-[:IN_LPA_COMMUNITY]->(c)
)

return label, nodes

I've been trying to understand the documentation for projecting a graph then performing community detection - and I think I've been close - but I just don't fully understand what is happening and how to project it correctly first in order to perform the LPA. Here is my code so far:

CALL gds.graph.project.cypher(
  'testProjection',
  'MATCH (p:Publication) RETURN id(p) AS id',
  'MATCH (p:Publication)-[r1:HAS_WORD]->(w)<-[r2:HAS_WORD]-(p2:Publication) WHERE r1.occurrence > 5 AND r2.occurrence > 5 RETURN id(p1) as source, id(p2) as target, count(w) as weight'
)
YIELD
  graphName AS graph, nodeCount AS nodes, relationshipCount AS rels, weightProperty AS weight

I think I'm mixing up elements of the projection and elements of the algorithm - I can't figure out what should happen and why. I've managed to make simple graph projections with the Publication nodes in the past - but it seems like that isn't enough information to perform the LPA algorithm.

Any help very much appreciated.

Lakeside52
  • what is the error you are getting? Can you give us sample data as well? Thanks. – jose_bacoy Aug 12 '22 at 13:02
  • Hi @jose_bacoy I'm getting various errors - the main thing is, I don't understand how to migrate this GA LPA community detection to GDS. I've tried hundreds of ways but I either am not projecting the correct graph in the first place or the GDS LPA stream is incorrect. I don't know how to include the occurrences > 5 part and if it should be to the projection or the LPA stream. – Lakeside52 Aug 12 '22 at 13:24
  • 1) It is hard for me to create my own test data. 2) It is difficult to fix a problem if I cannot see it. Please help me to help you. – jose_bacoy Aug 12 '22 at 13:41
  • I understand it's a difficult one to help with. The data comes from many different places and was produced with different techniques, so it's tricky to give sample data. In more general terms, I'm trying to recreate that original code (first code block) in the new format with projections and GDS. Could you describe how you would split out the projection part and then how you would stream the LPA? – Lakeside52 Aug 12 '22 at 14:09
  • Did you try the Migration Guide? https://neo4j.com/docs/graph-data-science/1.8/appendix-b/ In fact, since you are migrating two major versions, there are two migration guides to consider. The above one guides you from old algos library to GDS 1.x, this next one guides you from GDS 1.x to 2.x: https://neo4j.com/docs/graph-data-science/2.1/appendix-b/ – Mats_SX Sep 08 '22 at 06:58
  • Actually I think your projection query looks correct. What is the issue you have with it? – Mats_SX Sep 08 '22 at 07:14
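
To make the split the question asks about concrete: in GDS 2.x the two Cypher queries, including the occurrence > 5 filter, belong to the projection, while the weight property and the write-back logic belong to the LPA call. Below is a minimal sketch, assuming GDS 2.x (where gds.graph.project.cypher is available) and reusing the graph name testProjection from the question. It differs from the attempt in the question in two places: the relationship query binds p1 instead of p so that id(p1) resolves, and weightProperty is left out of YIELD because, as far as I know, the projection procedure has no such output column. The extra weight column returned by the relationship query is stored as a relationship property on the projected graph. This has not been run against the asker's data.

```cypher
// Step 1: project the in-memory graph. All filtering (occurrence > 5)
// happens here, because it decides which relationships exist at all.
CALL gds.graph.project.cypher(
  'testProjection',
  'MATCH (p:Publication) RETURN id(p) AS id',
  'MATCH (p1:Publication)-[r1:HAS_WORD]->(w)<-[r2:HAS_WORD]-(p2:Publication)
   WHERE r1.occurrence > 5 AND r2.occurrence > 5
   RETURN id(p1) AS source, id(p2) AS target, count(w) AS weight'
)
YIELD graphName AS graph, nodeCount AS nodes, relationshipCount AS rels;

// Step 2: stream LPA over the projected graph. The algorithm only needs
// to know which relationship property to use as the weight.
CALL gds.labelPropagation.stream(
  'testProjection',
  {relationshipWeightProperty: 'weight'}
)
YIELD nodeId, communityId
// Same post-processing as the old query: keep communities with more
// than two members and link each member to a community node.
WITH communityId, collect(gds.util.asNode(nodeId)) AS nodes
WHERE size(nodes) > 2
MERGE (c:PublicationLPACommunity {id: communityId})
FOREACH (n IN nodes |
  MERGE (n)-[:IN_LPA_COMMUNITY]->(c)
)
RETURN communityId, nodes;
```

The two statements are meant to be run one after the other. If the projection has to be rebuilt, for example after changing the filter, CALL gds.graph.drop('testProjection') removes the existing one from the graph catalog first.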
