This is more of a "how to" question. There may be several ways to do this, but I am trying to find the most performant and effective one.

I have a graph in which some nodes act as fork nodes, i.e. they split into two paths that later meet at another node. I know the node id and properties of the fork node (Node A in the example below) and would like to find the node (Node B in the example below) at which the two paths meet.

Note: these paths can be of variable length, e.g. one may have 6 intermediate nodes and the other only 2:

NodeA -[]-> Node 1 -[]-> Node 2 -[]-> Node 3 -[]-> Node 4 -[]-> Node 5 -[]-> Node 6 -[]-> Node B -[]-> Node C -[]-> Node D -[]-> Node E
NodeA -[]-> Node 7 -[]-> Node 8 -[]-> Node B -[]-> Node C -[]-> Node D -[]-> Node E

As you can see, Node A splits into two paths that meet again at Node B. Knowing Node A, I want to get Node B. Please suggest how this can be done in Cypher.
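
For anyone who wants to reproduce the example, here is a minimal sketch that builds the graph drawn above. The `Step` label and the `name` property are assumptions made up for illustration; `NEXT` is the relationship type mentioned in the comments, and `id: 1234` matches the value assumed in the first answer.

```cypher
// Hypothetical sample data: the Step label and name property are illustrative only.
CREATE (a:Step {id: 1234, name: 'A'}),
       (n1:Step {name: '1'}), (n2:Step {name: '2'}), (n3:Step {name: '3'}),
       (n4:Step {name: '4'}), (n5:Step {name: '5'}), (n6:Step {name: '6'}),
       (n7:Step {name: '7'}), (n8:Step {name: '8'}),
       (b:Step {name: 'B'}), (c:Step {name: 'C'}),
       (d:Step {name: 'D'}), (e:Step {name: 'E'}),
       // long branch: A -> 1 -> ... -> 6 -> B
       (a)-[:NEXT]->(n1)-[:NEXT]->(n2)-[:NEXT]->(n3)-[:NEXT]->(n4)-[:NEXT]->(n5)-[:NEXT]->(n6)-[:NEXT]->(b),
       // short branch: A -> 7 -> 8 -> B
       (a)-[:NEXT]->(n7)-[:NEXT]->(n8)-[:NEXT]->(b),
       // shared tail after the meeting point
       (b)-[:NEXT]->(c)-[:NEXT]->(d)-[:NEXT]->(e)
```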

Thanks, Deepesh

deepesh
  • It's an interesting question. Can you please clarify though? Are you looking for paths only which diverge starting at A, or is it OK if they diverge somewhere else along the path and share a few steps before that happens? Is it OK if the paths overlap (somewhere in the middle they share a step)? – FrobberOfBits Dec 24 '14 at 13:39
  • Before Node A it's one path, maybe something like X > Y > Z > A. At Node A it splits into two paths, which then meet at some Node B. The length of each path is variable, and the relationship type between all these nodes is [:NEXT] – deepesh Dec 24 '14 at 15:33
  • To answer your question about "if they diverge somewhere else along the path": no, because the point at which they diverge will be node "A". Good line of thought though, thanks for asking – deepesh Dec 24 '14 at 15:36

2 Answers

I'm going to assume that you know node a via its id property and that it's 1234. You may also want to use labels here; I'm not sure whether you're using them, so I left them out.

MATCH
  (a)-[*1..10]->(b),
  (a)-[*1..10]->(b)
WHERE a.id = 1234
RETURN b

You could return the lengths of the paths too, but this should get you the result. Also note that you can adjust the maximum path length (10 in this example) to trade completeness against query performance; the right value depends on the structure of your graph.
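
As a rough sketch of returning the path lengths as well: the patterns need path variables so that `length()` can be applied to them. The `:NEXT` relationship type is taken from the asker's comment and is an assumption here; drop it if your relationships are typed differently.

```cypher
MATCH
  path1 = (a)-[:NEXT*1..10]->(b),
  path2 = (a)-[:NEXT*1..10]->(b)
WHERE a.id = 1234
// length() counts the relationships in each path
RETURN b, length(path1) AS length1, length(path2) AS length2
```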

EDIT:

Also, if that doesn't work you may need to do:

MATCH
  path1 = (a)-[*1..10]->(b),
  path2 = (a)-[*1..10]->(b)
WHERE a.id = 1234 AND path1 <> path2
RETURN b
Brian Underwood
  • It's a neat question, and I like your answer. (The second one I mean). I think you have to say that `path1<>path2`, otherwise your first query raises the possibility it'll traverse the same path twice. – FrobberOfBits Dec 24 '14 at 13:36
  • But another kink of this - in your second query, the paths don't have to diverge at A. Say we had (A->n1->n3->B) and then (A->n1->n2->n4->B). Notice how the paths don't diverge until n1, not diverging at A. Your query won't catch that case, but I'm not sure it matters to OP. – FrobberOfBits Dec 24 '14 at 13:37
  • That's a good point. In that case I think you might need to use the nodes() function on the paths, though I don't know if you can do an intersection on arrays in Cypher. – Brian Underwood Dec 24 '14 at 14:38
  • As for the `path1<>path2`, I'm pretty sure that Cypher will automatically do that for you when you specify two nodes with the same label in different parts of your `MATCH` (I think maybe it didn't use to), but I wasn't sure if that happened for paths – Brian Underwood Dec 24 '14 at 14:39
  • It seems like your second query should work for OP so I upvoted. I wish OP would be a bit more specific about the question though because there might be a much harder question in there somewhere. – FrobberOfBits Dec 24 '14 at 14:41
  • I just realized if the path is 1..10, and you had a "figure 8" graph where there was more than one point of intersection, you might also need a shortestPath in there somewhere. There could be many more than one B depending on how the graph was shaped. – FrobberOfBits Dec 24 '14 at 14:42
  • Thanks Brian, this query works perfectly for me. The only problem I see is that it returns multiple nodes for b, i.e. it returns B, C, D, E, F etc. because it does not stop at the intersection point. I have updated the example in the question with nodes B, C, D, E, F etc – deepesh Dec 24 '14 at 15:48
  • Probably `path1 = shortestPath((a)-[*1..10]->(b))` for both paths would work – Brian Underwood Dec 24 '14 at 16:01
  • Added RETURN b LIMIT 1 to get the intersection and it works now, thanks Brian/Frobber, I will post an answer to the question also. – deepesh Dec 24 '14 at 16:02
  • Btw, I tried the shortest path and it gives this error: shortestPath(...) does not support a minimal length (line 1, column 7) ......... MATCH shortestPath((a)-[:NEXT*1..10]->(b)) WHERE a.block_id = '1234' and a <> b RETURN b LIMIT 1 – deepesh Dec 24 '14 at 16:14
  • Oh, yeah, that means you need to do `path1 = (a)-[*..10]->(b)` (basically you need to take away that minimal number, 1 in this case) – Brian Underwood Dec 24 '14 at 16:15
  • Just a note here: the shortest path logic doesn't seem to work, because it only returns the shortest of the multiple paths (i.e. all nodes in that path). It neither stops at the intersection node nor returns it... – deepesh Dec 26 '14 at 20:18
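
Picking up the nodes() idea from the comments, one possible sketch is to require that the two paths share no interior node, so that b can only be the first node where they meet. This is an illustration under assumptions rather than a tested answer: it assumes the `:NEXT` relationship type from the asker's comment and a Cypher version that supports list slicing and the `none()` predicate.

```cypher
MATCH
  path1 = (a)-[:NEXT*1..10]->(b),
  path2 = (a)-[:NEXT*1..10]->(b)
WHERE a.id = 1234
  AND path1 <> path2
  // nodes(path1)[1..-1] drops the start node a and the end node b;
  // keep b only if none of the remaining nodes also lies on path2
  AND none(n IN nodes(path1)[1..-1] WHERE n IN nodes(path2))
RETURN DISTINCT b
```

On the example in the question this should leave only Node B: any pair of paths ending at C, D, or E passes through B on both branches, so B is a shared interior node and the pair is filtered out. `DISTINCT` is there because each pair of paths matches twice, once in each order.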
The answer to my question is below (thanks to Brian; this is his answer with a LIMIT 1 added):

MATCH
  path1 = (a)-[*1..10]->(b),
  path2 = (a)-[*1..10]->(b)
WHERE a.id = 1234 AND path1 <> path2
RETURN b LIMIT 1

I added LIMIT 1 to return only the intersection node; without it, the query also returns all the nodes that follow the intersection node.
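
One caveat: Cypher does not guarantee row order without ORDER BY, so a bare LIMIT 1 may return any of the matching nodes. A hedged variant is to sort by the combined length of the two paths so that the closest meeting node comes first. The `:NEXT` type comes from the comments on the question, and `totalLength` is just an illustrative alias.

```cypher
MATCH
  path1 = (a)-[:NEXT*1..10]->(b),
  path2 = (a)-[:NEXT*1..10]->(b)
WHERE a.id = 1234 AND path1 <> path2
// the nearest meeting node has the smallest combined path length
RETURN b, length(path1) + length(path2) AS totalLength
ORDER BY totalLength
LIMIT 1
```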

deepesh