
I am running into this wall regarding bidirectional relationships.

Say I am attempting to create a graph that represents a family tree. The problem here is that:
* Timmy can be Suzie's brother, but
* Suzie can not be Timmy's brother.

Thus, it seems necessary to model this in 2 directions, with a separate relationship in each direction between the two nodes.

(Sure, technically I could say SIBLING_TO and leave only one edge... but I'm not sure what the vocabulary would be when I try to connect a grandma to a grandson.)

When it's all said and done, I'm pretty sure there's no way around the fact that direction matters in this example.

I was reading this blog post, regarding common Neo4j mistakes. The author states that this bidirectionality is not the most efficient way to model data in Neo4j and should be avoided.

And I am starting to agree. I set up a mock data set of 2 families and found that a lot of the queries I was attempting to run were very, very slow. This is because of the 'all connected to all' nature of the graph, at least within each family.

My question is this:
1) Am I correct to say that bidirectionality is not ideal?

2) If so, is my example of a family tree representable in any other way...and what is the 'best practice' in the many situations where my problem may occur?

3) If it is not possible to represent the family tree in another way, is it technically possible to still write queries in some manner that gets around the problem of 1) ?

Thanks for reading this and for your thoughts.

Monica Heddneck
  • Bidirectional links with the same edge name are redundant and don't add value. Brother-to and sister-to convey some information, though that could be inferred from a property. A (child)-[:PARENT]->(parent) relationship gets you the parent/child relationship and, from it, the entire biological family relationship, and you can use it for every generation of parents and kids. Step kids would be a different matter. – Tim Kuehn Apr 27 '16 at 02:44

2 Answers


Storing redundant information (your bidirectional relationships) in a DB is never a good idea. Here is a better way to represent a family tree.

To indicate "siblingness", you only need a single relationship type, say SIBLING_OF, and you only need to have a single such relationship between 2 sibling nodes.

To indicate ancestry, you only need a single relationship type, say CHILD_OF, and you only need to have a single such relationship from a child to each of its parents.

You should also have a node label for each person, say Person. And each person should have a unique ID property (say, id), and some sort of property indicating gender (say, a boolean isMale).
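As a minimal sketch of this model, the statement below creates a small family. The names, `id` values, and the `name` property are made up for illustration (Timmy and Suzie are borrowed from the question); only the `Person` label, the `id` and `isMale` properties, and the two relationship types come from the description above.

```cypher
// Hypothetical sample data: names and id values are illustrative only.
CREATE (mom:Person   {id: 1, name: 'Ann',   isMale: false})
CREATE (dad:Person   {id: 2, name: 'Bob',   isMale: true})
CREATE (timmy:Person {id: 3, name: 'Timmy', isMale: true})
CREATE (suzie:Person {id: 4, name: 'Suzie', isMale: false})
// One CHILD_OF relationship from each child to each parent.
CREATE (timmy)-[:CHILD_OF]->(mom)
CREATE (timmy)-[:CHILD_OF]->(dad)
CREATE (suzie)-[:CHILD_OF]->(mom)
CREATE (suzie)-[:CHILD_OF]->(dad)
// A single SIBLING_OF relationship per sibling pair. CREATE requires a
// direction, but the choice is arbitrary because queries can ignore it.
CREATE (timmy)-[:SIBLING_OF]->(suzie);
```

Every stored relationship in Neo4j has a direction, so the one on SIBLING_OF is simply never relied upon, while the one on CHILD_OF carries meaning (child to parent).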

With this very simple data model, here are some sample queries:

  1. To find Person 123's sisters (note that the pattern does not specify a relationship direction):

    MATCH (p:Person {id: 123})-[:SIBLING_OF]-(sister:Person {isMale: false})
    RETURN sister;
    
  2. To find Person 123's grandfathers (note that this pattern specifies that matching paths must have a depth of 2):

    MATCH (p:Person {id: 123})-[:CHILD_OF*2]->(gf:Person {isMale: true})
    RETURN gf;
    
  3. To find Person 123's great-grandchildren:

    MATCH (p:Person {id: 123})<-[:CHILD_OF*3]-(ggc:Person)
    RETURN ggc;
    
  4. To find Person 123's maternal uncles:

    MATCH (p:Person {id: 123})-[:CHILD_OF]->(:Person {isMale: false})-[:SIBLING_OF]-(maternalUncle:Person {isMale: true})
    RETURN maternalUncle;
    
cybersam
  • This answer was exactly what I needed to hear -- thank you so much. Although the idea behind directionality is still sketchy in my head (why apply direction if it may or may not be irrelevant?), the idea of reducing redundancy so that you have the minimum number of relationships is absolutely golden. The example queries are fantastic. Thanks again. – Monica Heddneck Apr 29 '16 at 02:25

I'm not sure if you are aware that it's possible to query bidirectionally (that is, to ignore the direction). So you can do:

    MATCH (a)-[:SIBLING_OF]-(b)
    RETURN a, b;

and since I'm not matching a direction it will match both ways. This is how I would suggest modeling things.
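One thing to be aware of: because an undirected pattern matches both ways, each sibling pair comes back twice, once as (a, b) and once as (b, a). A minimal sketch of one way to keep a single row per pair, assuming each node has a `Person` label and a unique, comparable `id` property (as proposed in the other answer, not something this pattern requires):

```cypher
// Assumes a unique, comparable `id` property on every Person node.
MATCH (a:Person)-[:SIBLING_OF]-(b:Person)
WHERE a.id < b.id   // keep only one ordering of each pair
RETURN a, b;
```

When the query starts from one specific person, the filter is unnecessary, since only that person's siblings are returned.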

Generally you only want to create multiple relationships if you actually want to store different state. For example, a KNOWS relationship might apply in only one direction, because person A might know person B while B does not know A. Similarly, you might have a LIKES relationship with a value property showing how much A likes B, and there might be different strengths of "liking" in the two directions.
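To make that concrete, here is a small sketch in which the two directions really do hold different state. The people, the `name` property, and the numeric values are hypothetical; only the KNOWS and LIKES types and the `value` property come from the description above.

```cypher
// Hypothetical people and values, for illustration only.
CREATE (a:Person {name: 'A'})
CREATE (b:Person {name: 'B'})
// A knows B, but not the other way around: one directed relationship.
CREATE (a)-[:KNOWS]->(b)
// Liking differs in strength per direction: two relationships are justified.
CREATE (a)-[:LIKES {value: 8}]->(b)
CREATE (b)-[:LIKES {value: 3}]->(a);

// A separate statement: here the direction in the pattern matters,
// because it selects how much A likes B rather than the reverse.
MATCH (:Person {name: 'A'})-[l:LIKES]->(:Person {name: 'B'})
RETURN l.value;
```

In contrast with SIBLING_OF, dropping the arrow from this last pattern would return both relationships and lose the distinction.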

Brian Underwood
  • Would you agree that 'granddaughter of' and 'grandmother of' are different states, and thus necessitate multiple relationships? – Monica Heddneck Apr 27 '16 at 17:19
  • I don't think so... I would say that those two relationships are different ways to represent the same state of affairs in your database. This is an aspect of modeling in Neo4j. You need to pick one and stick with it – Brian Underwood Apr 27 '16 at 17:26
  • Better still, though, if it makes sense for you, is to find grandparents by matching a path of two child_of relationships. But you don't always have that and you still want to represent the grandchild relationship – Brian Underwood Apr 27 '16 at 17:28
  • Your second comment was a bit of a mindblower. Didn't think of that at all. I'm getting the hunch that I should model a family tree with unidirectional relationships and see if I can get good enough with Cypher to get what I want out of it. – Monica Heddneck Apr 27 '16 at 17:30