I've been searching for a way to plot a family tree but couldn't find anything I could reproduce. I also looked in Hadley's book on ggplot, with the same result.

I want to plot a family tree from a data frame similar to this:

familyTree <- data.frame(
  id = 1:6,
  cnp = c("11", NA, "22", NA, NA, "33"),
  last_name = c("B", "B", "B", NA, NA, "M"),
  last_name_alyas = rep(c(NA, "M"), c(5L, 1L)),
  middle_name = rep(c("C", NA), c(1L, 5L)),
  first_name = c("Me", "P", "A", NA, NA, "S"),
  first_name_alyas = rep(c(NA, "F"), c(5L, 1L)),
  maiden_name = c(NA, NA, "M", NA, NA, NA),
  id_father = c(2L, 4L, 6L, NA, NA, 8L),
  id_mother = c(3L, 5L, 7L, NA, NA, 9L),
  birth_date = c("1986-01-01", "1963-01-01", "1964-01-01", NA, NA, "1936-01-01"),
  birth_place = c("City", "Village", "Village", NA, NA, "Village"),
  death_date = c("0000-00-00", NA, NA, NA, NA, "2007-12-23"),
  death_reason = rep(c(NA, "stroke"), c(5L, 1L)),
  nr_brothers = c(NA, 1L, NA, NA, NA, NA),
  brothers_names = c(NA, "M", NA, NA, NA, NA),
  nr_sisters = c(1L, NA, 1L, NA, NA, 2L),
  sisters_names = c("A", NA, "E", NA, NA, NA),
  school = c(NA, "", "", NA, NA, ""),
  occupation = c(NA, "", "", NA, NA, ""),
  diseases = rep(NA_character_, 6L),
  comments = rep(NA_character_, 6L)
)

Is there any way I can plot a family tree with ggplot? If not, how can I plot it using another package?

The primary key is `id`, and each person is connected to other family members through `id_father` and `id_mother`.

moodymudskipper
Alex Burdusel
  • What have you tried? Do you know how to use R and its graphics? Have you looked at using igraph to represent your data and its graphics methods? A family tree is a type of graph, so igraph and the whole Graphical Models Task View would be a good place to start. Have you read that yet? – Spacedman Jul 01 '12 at 11:43
  • Perhaps you can modify plots in [ggphylo](https://github.com/gjuggler/ggphylo), an extension to ggplot2 to make phylogenetic tree plots. – sckott Jul 01 '12 at 16:56

2 Answers

As noted in the comments, you should try igraph. Here is a quick start:

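library(igraph)
# One table per parent type: child id, parent id, and the child's name parts
mothers <- familyTree[, c("id", "id_mother", "first_name", "last_name")]
fathers <- familyTree[, c("id", "id_father", "first_name", "last_name")]
mothers$name <- paste(mothers$first_name, mothers$last_name)
fathers$name <- paste(fathers$first_name, fathers$last_name)
# After renaming, 'parent' holds the child's id and 'id' holds the parent's id
names(mothers) <- c("parent", "id", "first_name", "last_name", "name")
names(fathers) <- c("parent", "id", "first_name", "last_name", "name")
links <- rbind(mothers, fathers)
links <- links[!is.na(links$id), ]  # drop rows with no recorded parent
# The first two columns define the edges; the rest become edge attributes
g <- graph_from_data_frame(links)
co <- layout_as_tree(g, flip.y = FALSE)
plot(g, layout = co)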

There aren't any names, and the arrows are going in the wrong direction, but you should be able to go from there.
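
For what it's worth, here is a minimal sketch of one way to handle both points: edges are built from parent to child, and names are attached to the vertices instead of the edges. Several things in it are assumptions and not part of the answer above. The `familyTree` below is a cut-down stand-in with only the columns needed, the `edges`, `vertices`, and `label` names are illustrative, people without a recorded name fall back to their id as a label, and the person with id 1 is picked as the root of the layout.

```r
library(igraph)

# Cut-down stand-in for the question's familyTree (assumption: only these columns matter)
familyTree <- data.frame(
  id         = 1:6,
  first_name = c("Me", "P", "A", NA, NA, "S"),
  last_name  = c("B", "B", "B", NA, NA, "M"),
  id_father  = c(2L, 4L, 6L, NA, NA, 8L),
  id_mother  = c(3L, 5L, 7L, NA, NA, 9L)
)

# Edges run from parent to child
edges <- rbind(
  data.frame(from = familyTree$id_father, to = familyTree$id),
  data.frame(from = familyTree$id_mother, to = familyTree$id)
)
edges <- edges[!is.na(edges$from), ]

# One vertex per person mentioned anywhere, including parents without their own row
ids   <- sort(unique(c(familyTree$id, edges$from)))
idx   <- match(ids, familyTree$id)
known <- !is.na(idx) & !is.na(familyTree$first_name[idx])
label <- ifelse(
  known,
  paste(familyTree$first_name[idx], familyTree$last_name[idx]),
  as.character(ids)  # fall back to the id when no name is recorded
)
vertices <- data.frame(name = as.character(ids), label = label)

g <- graph_from_data_frame(edges, directed = TRUE, vertices = vertices)

# Lay the graph out as a tree rooted at person 1, ignoring edge direction
root <- which(V(g)$name == "1")
co   <- layout_as_tree(g, root = root, mode = "all", flip.y = FALSE)
plot(g, layout = co, vertex.label = V(g)$label, edge.arrow.size = 0.5)
```

Passing `mode = "all"` lets the layout treat the graph as a tree hanging off the chosen root even though the arrows now point toward that root.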

nograpes
  • Thanks nograpes. I looked over igraph but couldn't figure out where to start from. Your example is a good start. – Alex Burdusel Jul 02 '12 at 07:35
  • I have another question. I managed to turn the tree 180 degrees and place the names instead of the numbers on the vertices. The problem is that the tree becomes too crowded. Is there any way I can specify the layout to place more space between the vertices on the x axis? – Alex Burdusel Jul 02 '12 at 12:52
  • To be honest I only learned of `igraph` a few days ago. They do have some nice documentation (http://igraph.sourceforge.net/) on their website; I'm sure it is possible. – nograpes Jul 02 '12 at 13:17
  • I looked over the manual, but it looks to me like layout.reingold.tilford() doesn't have any parameter for that. – Alex Burdusel Jul 02 '12 at 13:55
  • This sounds like a new question. You may want to check out other questions that are tagged igraph as well. – Andy Clifton Aug 15 '13 at 18:45

Have you tried the kinship2 package?

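library(kinship2)
# sex: 1 = male, 2 = female; dadid/momid of 0 mark founders with no recorded parents
df <- data.frame(id = c(1,2,3,4,5,6), sex = c(1,2,1,2,2,2),
                 dadid = c(0,0,0,0,1,3), momid = c(0,0,0,0,2,4), famid = 1)
# One extra relationship row: id1 = 2, id2 = 3, code = 4 (spouse), famid = 1
relation1 <- matrix(c(2,3,4,1), nrow = 1)
foo <- pedigree(id = df$id, dadid = df$dadid, momid = df$momid, sex = df$sex,
                relation = relation1, famid = df$famid)
ped <- foo['1']  # select family 1 from the pedigree list
plot(ped)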

This draws a pedigree chart of the family.

zx8754
InspectorSands