Data structure for a family tree with multiple partners and siblings?

Question

I have a very basic family tree structure but I need to figure out how to make it support multiple partners and siblings without as much redundancy.

The base of the entire tree is the person that's creating the tree.

Consider this very simple structure:

{
    "name": "Me",
    "dob": "1988",
    "parents": [
        {
            "name": "Gina Carano",
            "dob": "1967"
        },
        {
            "name": "Genghis Khan",
            "dob": "1961"
        }
    ],
    "children": [
        {
            "name": "Tim",
            "dob": "1992"
        }
    ]
}

This works nicely but what if I discovered I had a half sibling named Judy (Genghis Khan loved the ladies) and a full sibling named Brian and expanded it to this?

{
    "name": "Me",
    "dob": "1988",
    "parents": [
        {
            "name": "Gina Carano",
            "dob": "1967"
        },
        {
            "name": "Genghis Khan",
            "dob": "1961"
        }
    ],
    "children": [
        {
            "name": "Tim",
            "dob": "1992"
        }
    ],
    "siblings": [
        {
            "name": "Judy",
            "dob": "1987",
            "parents": [
                {
                    "name": "Courtney Carano",
                    "dob": "1965"
                },
                {
                    "name": "Genghis Khan",
                    "dob": "1961"
                }
            ]
        },
        {
            "name": "Brian",
            "dob": "1988",
            "parents": [
                {
                    "name": "Gina Carano",
                    "dob": "1967"
                },
                {
                    "name": "Genghis Khan",
                    "dob": "1961"
                }
            ]
        }
    ]
}

This does map my 2 newfound siblings but now I have a bit of redundancy in my data, as Genghis Khan is in 3 different places. I could potentially create a one level list such as this:

[
    { "id": "1", "name": "Me", "dob": "1988", "parents": [2,3], "siblings": [4,5] },
    { "id": "2", "name": "Genghis Khan", "dob": "1961", "children": [1,4,5] },
    { "id": "3", "name": "Gina Carano", "dob": "1967", "children": [1,4] },
    { "id": "4", "name": "Tim", "dob": "1992", "parents" : [2,3] },
    { "id": "5", "name": "Judy", "dob": "1987", "parents": [2,6] },
    { "id": "6", "name": "Courtney Carano", "dob": "1965", "children": [5] }
]

Would this work out the same way without as much redundancy? And are there any foreseeable circumstances in which there would be any limitations in terms of mapping out multiple partners with children?
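One consequence of the flat list is that the "siblings" array does not need to be stored at all, since it can be derived from the "parents" references. The sketch below is only an illustration under a few assumptions: ids are normalized to numbers (the list above mixes string ids with numeric references), and `siblingsOf` is a hypothetical helper name, not part of d3.js or any library.

```javascript
// Flat list from the question, trimmed to the fields used here.
// Assumption: ids are plain numbers so they compare equal to the references.
const people = [
  { id: 1, name: "Me", parents: [2, 3] },
  { id: 2, name: "Genghis Khan" },
  { id: 3, name: "Gina Carano" },
  { id: 4, name: "Tim", parents: [2, 3] },
  { id: 5, name: "Judy", parents: [2, 6] },
  { id: 6, name: "Courtney Carano" },
];

// Hypothetical helper: find everyone who shares at least one parent with
// the given person and split them into full and half siblings.
function siblingsOf(people, id) {
  const result = { full: [], half: [] };
  const me = people.find((p) => p.id === id);
  if (!me) return result;

  const mine = new Set(me.parents ?? []);
  for (const p of people) {
    if (p.id === id) continue;
    const theirs = p.parents ?? [];
    const shared = theirs.filter((parentId) => mine.has(parentId)).length;
    if (shared === 0) continue;
    // Full sibling: exactly the same set of parents. Otherwise half sibling.
    const isFull = shared === mine.size && theirs.length === mine.size;
    (isFull ? result.full : result.half).push(p.name);
  }
  return result;
}

console.log(siblingsOf(people, 1));
// { full: [ 'Tim' ], half: [ 'Judy' ] }
```

The same idea works for "children": it is the inverse of "parents", so storing both directions is optional and mainly a lookup convenience.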

Note: I figure if I keep the initial structure, I'd have to add id keys to properly identify that Genghis Khan is the same in all 3 instances.

My end goal is to render a pedigree tree (probably in d3.js) in which partners are joined by a line, and a line from the middle of that link runs down to their children.

So with the dataset above, I'm trying to render:

You've already shown that it has less redundancy. I don't see any limitations with this format, unless you're talking about extremely large trees where the number of siblings becomes too large to be represented in JavaScript. But you'd run out of memory before that. — Lars Kotthoff, Dec 02 '15 at 18:31
@LarsKotthoff - I was just thinking the first structure might be easier to parse with d3.js - any thoughts on that? If that's the case I could generate the first structure from the second one. Probably overthinking this though. — meder omuraliev, Dec 02 '15 at 18:35
Yes, D3 will need an "expanded" structure, but it should be trivial to convert the latter to the former (this is how many of the force layout examples work; they process the raw data before putting it into the force layout). — Lars Kotthoff, Dec 02 '15 at 18:49
@meder I'm a little confused when you say "identify that Genghis Khan is the same in all 3 instances". How will the visualization look, given that Genghis Khan had multiple wives and you want to show him as a single node? Also, in the dataset Genghis Khan has child node 1 (Me), Gina Carano has child node 1 (Me), and Courtney Carano also has child node 1 (Me). How is that possible? Is the dataset correct? — Cyril Cherian, Dec 03 '15 at 01:06
@Cyril - Sorry. Courtney Carano's child node should be 5 (Judy). Trying to get the tree to render like this: http://i.imgur.com/7qpslYb.png. The graphical UI part of this should render pretty much the way http://www.familyecho.com/ does. — meder omuraliev, Dec 03 '15 at 03:12
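The conversion mentioned in the comments (flat list to "expanded" structure) could look roughly like the sketch below. It assumes numeric ids, walks "parents" only upward and "children" only downward so that the recursion cannot loop, and uses hypothetical function names (`ancestors`, `descendants`, `expand`). Gina Carano's children are listed as [1, 4] here so that they agree with Tim's parents.

```javascript
// Flat list from the question with ids normalized to numbers (assumption).
const people = [
  { id: 1, name: "Me", dob: "1988", parents: [2, 3] },
  { id: 2, name: "Genghis Khan", dob: "1961", children: [1, 4, 5] },
  { id: 3, name: "Gina Carano", dob: "1967", children: [1, 4] },
  { id: 4, name: "Tim", dob: "1992", parents: [2, 3] },
  { id: 5, name: "Judy", dob: "1987", parents: [2, 6] },
  { id: 6, name: "Courtney Carano", dob: "1965", children: [5] },
];

// Index once so every reference is a constant-time lookup.
const byId = new Map(people.map((p) => [p.id, p]));

// Walk upward: a person with their ancestors nested under "parents".
function ancestors(id) {
  const { name, dob, parents = [] } = byId.get(id);
  const node = { name, dob };
  if (parents.length) node.parents = parents.map(ancestors);
  return node;
}

// Walk downward: a person with their descendants nested under "children".
function descendants(id) {
  const { name, dob, children = [] } = byId.get(id);
  const node = { name, dob };
  if (children.length) node.children = children.map(descendants);
  return node;
}

// Combine both directions around the root person, as in the first structure.
function expand(rootId) {
  const root = ancestors(rootId);
  const down = descendants(rootId);
  if (down.children) root.children = down.children;
  return root;
}

console.log(JSON.stringify(expand(1), null, 2));
```

In D3 versions that provide `d3.hierarchy`, the nested result can be passed in with a children accessor such as `d => d.parents` to lay out the ancestor side. Joining partners with a shared line still needs custom drawing on top of that.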

score 2 · Answer 1 · answered Jan 04 '16 at 19:01

Almost all genealogy systems have IDs for people, so I wouldn't worry about adding/requiring that.

The traditional way of doing this is to have a Family node type as well as a Person node type. This allows multiple marriages and also gives you a place to connect information like marriage date, marriage place, etc.

{
    "person": [
        { "id": "p1", "name": "Me", "dob": "1988", "parents": "f3" },
        { "id": "p2", "name": "Genghis Khan", "dob": "1961", "parents": "f1", "spouse_families": ["f2", "f3"] },
        { "id": "p3", "name": "Gina Carano", "dob": "1967", "spouse_families": ["f3"] },
        { "id": "p4", "name": "Brian", "dob": "1992", "parents": "f3" },
        { "id": "p5", "name": "Judy", "dob": "1987", "parents": "f2" },
        { "id": "p6", "name": "Courtney Carano", "dob": "1965", "spouse_families": ["f2"] },
        { "id": "p7", "name": "Mother of Genghis" },
        { "id": "p8", "name": "Father of Genghis" }
    ],
    "family": [
        { "id": "f1", "marriage date": "", "parents": ["p7", "p8"], "children": ["p2"] },
        { "id": "f2", "marriage date": "", "parents": ["p6", "p2"], "children": ["p5"] },
        { "id": "f3", "marriage date": "", "parents": ["p3", "p2"], "children": ["p1", "p4"] }
    ]
}

This gives you a place to connect all the parents and children together without redundancy and lots of special casing. (Note: I corrected "Tim" to "Brian" in the data structure to match the graphic.)
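As a rough illustration of how this layout answers the "multiple partners" question, the sketch below lists each family a person is a parent in, together with the partner and children of that family. That grouping is what a renderer needs in order to draw one line per couple with the children hanging off it. The helper name `unionsOf` is hypothetical, and only the fields used here are kept.

```javascript
// Person and Family records from the answer, trimmed to the fields used here.
const person = [
  { id: "p1", name: "Me" },
  { id: "p2", name: "Genghis Khan" },
  { id: "p3", name: "Gina Carano" },
  { id: "p4", name: "Brian" },
  { id: "p5", name: "Judy" },
  { id: "p6", name: "Courtney Carano" },
  { id: "p7", name: "Mother of Genghis" },
  { id: "p8", name: "Father of Genghis" },
];
const family = [
  { id: "f1", parents: ["p7", "p8"], children: ["p2"] },
  { id: "f2", parents: ["p6", "p2"], children: ["p5"] },
  { id: "f3", parents: ["p3", "p2"], children: ["p1", "p4"] },
];

const personById = new Map(person.map((p) => [p.id, p]));
const nameOf = (id) => personById.get(id).name;

// Hypothetical helper: one entry per family in which the person is a parent.
function unionsOf(personId) {
  return family
    .filter((f) => f.parents.includes(personId))
    .map((f) => ({
      family: f.id,
      partners: f.parents.filter((id) => id !== personId).map(nameOf),
      children: f.children.map(nameOf),
    }));
}

console.log(unionsOf("p2"));
// [
//   { family: 'f2', partners: [ 'Courtney Carano' ], children: [ 'Judy' ] },
//   { family: 'f3', partners: [ 'Gina Carano' ], children: [ 'Me', 'Brian' ] }
// ]
```

Because the lookup goes through the family records, the "spouse_families" field on each person is not strictly required. It acts as a cached index for the same information.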
