5

I'm trying to use Neo4j to analyze relationships in a family tree. I've modeled it like so:

(p1:Person)-[:CHILD]->(f:Family)<-[:FATHER|MOTHER]-(p2)

I know I could have left out the Family nodes and connected children directly to each parent, but that's not practical for my purposes. Here's an example of my graph; the black line is the path I want the query to generate:

[Image: Full tree]

I can query for it with

MATCH p=(n {personID:3})-[:CHILD]->()<-[:FATHER|MOTHER]-()-[:CHILD]->()<-[:FATHER|MOTHER]-()-[:CHILD]->()<-[:FATHER|MOTHER]-() RETURN p

but there's a repeating pattern to the relationships. Could I do something like:

MATCH p=(n {personID:3})(-[:CHILD]->()<-[:FATHER|MOTHER]-())* RETURN p

where the * means repeat the :CHILD then :FATHER|MOTHER relationships, with the directions being different? Obviously if the relationships were all the same direction, I could use

-[:CHILD|FATHER|MOTHER*]->

I want to be able to query it from Person #3 all the way to the top of the graph like a pedigree chart, but also be specific about how many levels if needed (like 3 generations, as opposed to end-of-line).

Another issue: if I don't put directions on the relationships, as in -[:CHILD|FATHER|MOTHER*]-, the query starts at Person #3 and goes in the direction I want (alternating arrows), but it also descends back down the chain and finds all the other "cousins, aunts, uncles, etc.".

Any seasoned Cypher experts that can help me?

tralston

4 Answers

3

I ran into the same problem and found that the APOC path expander procedures do exactly what you/we want.

Applied to your example, you could use apoc.path.subgraphNodes to get all ancestors of Person #3:

MATCH (p1:Person {personID:3})
CALL apoc.path.subgraphNodes(p1, {
    sequence: '>Person,CHILD>,Family,<MOTHER|<FATHER'
}) YIELD node
RETURN node

Or, if you only want ancestors up to 3 generations from the start person, add maxLevel: 6 to the config (one generation spans 2 relationships, so 3 generations are 6 levels):

MATCH (p1:Person {personID:3})
CALL apoc.path.subgraphNodes(p1, {
    sequence: '>Person,CHILD>,Family,<MOTHER|<FATHER',
    maxLevel: 6
}) YIELD node
RETURN node

And if you only want ancestors of the 3rd generation, i.e. only great-grandparents, you can also specify minLevel (using apoc.path.expandConfig):

MATCH (p1:Person {personID:3})
CALL apoc.path.expandConfig(p1, {
    sequence: '>Person,CHILD>,Family,<MOTHER|<FATHER',
    minLevel: 6,
    maxLevel: 6
}) YIELD path
WITH last(nodes(path)) AS person
RETURN person
Brakebein
1

You could reverse the directionality of the CHILD relationships in your model, as in:

(p1:Person)<-[:CHILD]-(f:Family)<-[:FATHER|MOTHER]-(p2)

This way, you can use a simple -[:CHILD|FATHER|MOTHER*]-> pattern in your queries.

Reversing the directionality is actually intuitive as well, since you can then more naturally visualize the graph as a family tree, with all the arrows flowing "downwards" from ancestors to descendants.
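
If reversing the relationships in an existing database is an option, the change and the resulting ancestor query could look roughly like the sketch below. It assumes the Person and Family labels and the personID property from the question, and that the CHILD relationships carry no properties that need copying. Run the two statements separately, ideally on a copy of the data first:

```cypher
// Reverse every CHILD relationship so that it points from Family to Person.
MATCH (p:Person)-[old:CHILD]->(f:Family)
CREATE (f)-[:CHILD]->(p)
DELETE old;

// Every arrow now points from ancestor to descendant, so a single
// variable-length pattern walks up the tree from Person #3.
MATCH path = (ancestor:Person)-[:CHILD|FATHER|MOTHER*]->(:Person {personID: 3})
RETURN path;
```

To stop after three generations, the * can be bounded as *..6, since each generation spans two relationships.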

cybersam
  • That did cross my mind. It would most likely simplify the model. I am still interested in the more global concept of what to do in a situation like this, for example, in a graph model where you don't have the ability to change relationship directions or the name of it. The above you suggested would be `(p1:Person)<-[:HAS_CHILD]-(f:Family)<-[:FATHER|MOTHER]-(p2)` assuming directionality factors into edge naming. I prefer to be in the "active voice" so to speak when doing this. – tralston May 20 '15 at 18:37
0

Yeah, that's an interesting case. I'm pretty sure (though I'm open to correction) that this is just not possible. Would it be possible for you to have and maintain both? You could have a simple Cypher query create the extra relationships:

MATCH (parent)-[:MOTHER|FATHER]->()<-[:CHILD]-(child)
CREATE (child)-[:CHILD_OF]->(parent)
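
For what it's worth, newer Neo4j releases (5.9 and later, as far as I know) added quantified path patterns, which come very close to the repeating syntax asked for in the question. A sketch, assuming such a version plus the labels and personID property from the question, and untested against this data:

```cypher
// Repeat the two-hop CHILD / FATHER|MOTHER step between 1 and 3 times,
// i.e. walk up at most three generations from Person #3.
MATCH p = (n:Person {personID: 3})
          ((:Person)-[:CHILD]->(:Family)<-[:FATHER|MOTHER]-(:Person)){1,3}
RETURN p
```

Replacing {1,3} with + should lift the upper bound and follow the pattern all the way to the top of the tree.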
Brian Underwood
  • I can create the direct relationships no problem, as I'm importing all my data from an SQLite db. The problem is when doing it parent-child directly, it gets really messy with multiple spouses and step/half siblings. That's why I designated one Family unit to have parents and children. In normal database modeling, you can have a ternary or n-ary relationship/role, but according to Neo4j, you can't do that easily. A two-party relationship has to be objectified first and then it can participate in a relationship with another object. Hence the Family unit. – tralston May 20 '15 at 18:05
0

Ok, so here's a thought:

MATCH path=(child:Person {personID: 3})-[:CHILD|FATHER|MOTHER*]-(ancestor:Person)
WHERE (ancestor)-[:MOTHER|FATHER]->()
RETURN path

Normally I'd use a second clause in the MATCH like this:

MATCH
  path=(child:Person {personID: 3})-[:CHILD|FATHER|MOTHER*]-(ancestor:Person),
  (ancestor)-[:MOTHER|FATHER]->()
RETURN path

But Cypher (at least by default, I think) won't traverse the same relationship twice within one MATCH. Maybe the comma-separated form would be fine and it's this single-pattern form that would be a problem:

MATCH path=(child:Person {personID: 3})-[:CHILD|FATHER|MOTHER*]-(ancestor:Person)-[:MOTHER|FATHER]->()
RETURN path

I'm curious to know what you find!

Brian Underwood
  • That's a nice touch, adding the constraint like that. Two things: first, as modeled, it still needs to go through a :Family node before getting to another :Person node. Second, I got errors from your second snippet about wanting a comma in there. Both return an empty set. – tralston May 20 '15 at 18:34
  • Ah, right. I added the labels (I was looking at your second query and didn't see labels, so I thought you didn't have any) and the comma. Does that work? – Brian Underwood May 20 '15 at 18:47