I'm trying to filter a data.frame with family information. It looks like this:
+--------+-------+---------+
| name   | dad   | mom     |
+--------+-------+---------+
| john   | bert  | ernie   |
| quincy | adam  | eve     |
| anna   | david | goliath |
| daniel | bert  | ernie   |
| sandra | adam  | linda   |
+--------+-------+---------+
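To make the example reproducible, the table can be rebuilt roughly as follows. This is a sketch that assumes the columns are plain character vectors named exactly as in the table (name, dad, mom); the object name kinship is the one used in my code further down.

```r
# Rebuild the example table; column names and types are assumptions taken from the table above.
kinship <- data.frame(
  name = c("john", "quincy", "anna", "daniel", "sandra"),
  dad  = c("bert", "adam", "david", "bert", "adam"),
  mom  = c("ernie", "eve", "goliath", "ernie", "linda"),
  stringsAsFactors = FALSE  # keep the columns as character, not factor
)
```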
Now I want to know whether everyone who has the same dad also has the same mom. I've been at this for an hour, trying different approaches, but I keep getting stuck. Also, I'd like an idiomatic R approach rather than a long sequence of functions or for-loops that technically does what I want without teaching me anything new.
My expected output:
+--------+------+-------+
| name   | dad  | mom   |
+--------+------+-------+
| quincy | adam | eve   |
| sandra | adam | linda |
+--------+------+-------+
Essentially I want to have a data.frame with dads and moms who have kids from multiple partners.
So far my approach has been:
- split the df by the father column
- from the resulting list of dfs, remove all dfs with only one row (this is where I already get stuck; I can't make it work)
- remove all dfs where length(unique(df$mom)) == 1
- the resulting list should give me all siblings who share a dad but have different moms.
My code up to now:
fraternals <- split(kinship, kinship$father)
fraternals <- fraternals[-which(lapply(fraternals, function(x) if(nrow(x) == 1) { output TRUE }))]
But that doesn't run, because R says I cannot use TRUE in that way.
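For reference, the split-and-filter steps listed above could be sketched in base R along these lines. This is only one possible sketch, not necessarily the most idiomatic answer. It assumes the columns are named dad and mom as in the table (my own code uses father), and the names by_dad and keep are made up for illustration. Note that a group with a single row necessarily has a single mom, so the two "remove" steps collapse into one test.

```r
# Example data rebuilt from the table (column names dad/mom are assumed).
kinship <- data.frame(
  name = c("john", "quincy", "anna", "daniel", "sandra"),
  dad  = c("bert", "adam", "david", "bert", "adam"),
  mom  = c("ernie", "eve", "goliath", "ernie", "linda"),
  stringsAsFactors = FALSE
)

# Step 1: split the data.frame into one data.frame per dad.
by_dad <- split(kinship, kinship$dad)

# Steps 2 and 3: flag the groups in which more than one distinct mom appears.
# vapply() returns a plain logical vector, which can index the list directly.
keep <- vapply(by_dad, function(df) length(unique(df$mom)) > 1, logical(1))

# Step 4: bind the remaining groups back into a single data.frame.
fraternals <- do.call(rbind, by_dad[keep])
rownames(fraternals) <- NULL  # drop the "adam.2"-style row names added by rbind

print(fraternals)
```

With the example data this leaves the two rows for quincy and sandra. If no dad has children with more than one mom, by_dad[keep] is an empty list and do.call(rbind, ...) returns NULL, so that case would need separate handling.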