
I have a Neo4j database comprising Film and Person nodes connected by ACTED_IN relationships. Using APOC, I've managed to create a set of virtual ACTED_WITH relationships directly between Person nodes reflecting the fact that they're indirectly related through a Film in which they both appeared:

MATCH (a:Person)-[:ACTED_IN]->()<-[:ACTED_IN]-(b:Person)
WITH a, b
CALL apoc.create.vRelationship(a, 'ACTED_WITH', {}, b) YIELD rel
RETURN a, rel, b

This seems to capture the indirect relationship I'm looking for. Now I want to stream the results to Gephi for visualisation, but the relevant APOC procedure takes a paths argument. With real relationships, this works:

MATCH path = ()-[:ACTED_IN]->()
WITH collect(path) AS paths
CALL apoc.gephi.add(null, 'workspace0', paths) YIELD nodes, relationships, time
RETURN nodes, relationships, time

How can I create a set of paths from the virtual relationships yielded in the first code block for passing to Gephi as in the second? (Or is there a better way to deal with this kind of case?)

rjww

1 Answer

The data argument of the apoc.gephi.add procedure can be an instance of Node, Relationship, Path, Iterable, Map, Iterator, or Object.

So you don't need paths: a collected list of nodes and virtual relationships should work, like this:

MATCH (a:Person)-[:ACTED_IN]->()<-[:ACTED_IN]-(b:Person) WHERE ID(a) > ID(b) 
WITH DISTINCT a, b
CALL apoc.create.vRelationship(a, 'ACTED_WITH', {}, b) YIELD rel AS rel1
CALL apoc.create.vRelationship(b, 'ACTED_WITH', {}, a) YIELD rel AS rel2
WITH collect(rel1) + collect(rel2) + collect(a) + collect(b) AS data
CALL apoc.gephi.add(null, 'workspace0', data) YIELD nodes, relationships, time
RETURN nodes, relationships, time

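Update: If you need weights, aggregate over the shared films with count, so that each pair of people gets one relationship per direction whose weight is the number of films they have in common: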

MATCH (a:Person)-[:ACTED_IN]->(m:Film)<-[:ACTED_IN]-(b:Person) WHERE ID(a) > ID(b)
WITH a, b,
     count(m) AS vWeight
CALL apoc.create.vRelationship(a, 'ACTED_WITH', {weight: vWeight}, b) YIELD rel AS rel1
CALL apoc.create.vRelationship(b, 'ACTED_WITH', {weight: vWeight}, a) YIELD rel AS rel2
WITH collect(rel1) + collect(rel2) + collect(a) + collect(b) AS data
CALL apoc.gephi.add(null, 'workspace0', data) YIELD nodes, relationships, time
RETURN nodes, relationships, time
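
To make the aggregation easier to check by hand, here is a minimal plain-Python sketch of what the weighted query computes. It does not touch Neo4j, APOC, or Gephi; the `casts` data and the `co_appearance_weights` helper are hypothetical names invented for illustration only. Sorting each cast plays the role of `ID(a) > ID(b)` (each pair is visited in one orientation only), and the counter plays the role of `count(m)`.

```python
from collections import Counter
from itertools import combinations

# Hypothetical toy data (not from the question): film title -> cast names.
casts = {
    "Film A": {"Ann", "Bob", "Cy"},
    "Film B": {"Ann", "Bob"},
    "Film C": {"Bob", "Cy"},
}


def co_appearance_weights(casts):
    """Count the films shared by each unordered pair of people."""
    weights = Counter()
    for cast in casts.values():
        # sorted() fixes one orientation per pair, like WHERE ID(a) > ID(b),
        # so a pair is never counted twice for the same film.
        for a, b in combinations(sorted(cast), 2):
            weights[(a, b)] += 1  # same role as count(m) grouped by a, b
    return weights


weights = co_appearance_weights(casts)

# One reciprocal pair of weighted edges per pair of people,
# mirroring the rel1 / rel2 virtual relationships in the query.
edges = [(a, b, w) for (a, b), w in weights.items()]
edges += [(b, a, w) for (a, b), w in weights.items()]

for edge in sorted(edges):
    print(edge)
```

Each pair of people ends up with exactly one edge per direction, however many films they share, which is the collapsed form asked about in the comments.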
stdob--
  • This is great, thank you. Kind of an extension to my original question, but is there a good way to collapse parallel relationships between nodes together? (What I mean is duplicated connections between nodes created when two actors appear in more than one film together.) – rjww Aug 16 '18 at 11:56
  • @rjww Depends on your goals. In my opinion, there should be a single relationship with a property holding the number of films in common. – stdob-- Aug 16 '18 at 13:05
  • Yes, ideally there would be a single virtual relationship (or reciprocal pair, I guess, given that Neo4j graphs are directed) between co-appearing actors with a weight property that is a count of the number of films in which they appear together. The end goal is to get something like that into Gephi so that I can create some exploratory visualisations with different measures of centrality, etc. – rjww Aug 16 '18 at 13:56
  • @rjww For weights use the aggregation function `count`. See the addition to the answer. – stdob-- Aug 16 '18 at 14:12
  • What is a possible way to merge duplicate virtual relationships between two nodes? – daudichya Dec 23 '19 at 11:55
  • @daudichya Can you clarify your question with an example? – stdob-- Dec 23 '19 at 13:31
  • @stdob please see this link: https://stackoverflow.com/questions/59465211/how-to-group-or-merge-virtual-relationship-created-using-apoc-create-vrelationsh Thanks for your interest! – daudichya Dec 25 '19 at 11:44