4

Here's my situation. I am building a database to track relationships between people and households. Typically, everything is tied to a "head of household".

I am trying to avoid this as it creates problems when people move (i.e. brother finally gets a job and moves out) or families break up (i.e. dad and mom get divorced, household is now split into two, some kids stay with mom, some with dad).

It is a huge challenge to figure out how to cascade the data when the model is based on HOH. My approach is split the data into 3 tables, household, person, and relationship. hh just stores an address and an id. relationship stores a person_id_a, person_id_b and a relationship code (i.e. 1 = sibling). And person stores the names, and a hh_id. This way, if a family splits up, I can change the relationship between mom and dad to NULL and create a new hh for dad and any kids that came with him without disrupting the other relationships.![Here is what the model looks like:

Person -person_id -hh_id

Relationship -person_id_a -person_id_b -relationship_cd

Household -hh_id -address

Does this makes sense to you guys? Can you think of any reason this wouldn't work, or think there is a better model out there?

Sorry, I know this is a bit depressing. All these divorces making my job difficult >:(

Taryn
  • 242,637
  • 56
  • 362
  • 405
imfm
  • 119
  • 7
  • Cool concept! May I ask what you are building this for and what database software you are using? (btw last line made me giggle) – nivix zixer May 08 '15 at 16:03
  • You may want to consider storing names in a separate table where one person can have multiple names. Regarding households, can your model store the situation of a person having multiple households, since that's not uncommon (second homes, split custody, etc.)? – hatchet - done with SOverflow May 08 '15 at 16:03
  • @nivixzixer work for a non-profit. We track services we provide to individuals and families and we are transitioning to a new database and reporting system. I don't know exactly what software we will be using. We have a vendor who will actually build it. – imfm May 08 '15 at 16:13
  • @hatchet That is an interesting point. No, in this model each person has only one hh_id at a time, or possibly none if they are homeless. Any ideas on how to accommodate those situations? – imfm May 08 '15 at 16:16
  • @Ian - an intersect table between Person and Household to create a many to many relationship. A household can have many (zero or more) people, and a person can have many (zero or more) households. A flag in the intersect record could indicate the primary household of a person if that is needed. – hatchet - done with SOverflow May 08 '15 at 16:39
  • As far as names, I am thinking we will use first name, last name (maiden name for women), middle initial, and birth date to ensure each person has a unique id. This may result in the occasional duplicate, but hopefully data entry staff will be vigilant about this. Anything that can change such as married name, nickname, marital status will be stored in a person_detail table with a 1:n relationship to the person table. – imfm May 08 '15 at 16:51
  • @hatchet Excellent point. I will definitely discuss that with my team. Is there a way to "up vote" comments? I really appreciate the input. – imfm May 08 '15 at 16:55
  • Highly recommend against middle name column. It's an invitation to incorrectly enter names from different cultures, particularly dual Hispanic surnames by people not familiar with them. For instance, Juan Gomez Portillo will invariably get Gomez (the paternal surname) stuck in the middle name column (which is really wrong). Better to have just two name columns: GivenNames, and FamilyNames. Married names and nicknames should be additional names in the PersonName table, not shoved off into a peripheral table. It will make it much easier to deal with in a name search algorithm. – hatchet - done with SOverflow May 08 '15 at 17:00
  • @hatchet Another good point. The trick is going to be finding the way to handle names that 1) fits the reality of naming conventions across cultures and 2) is simple enough not to confuse our clients and/or staff when collecting/inputting the data. Maybe splitting it up into PaternalLastName, MaternalLastName, MarriedLastName? This could help with Hispanic naming conventions also help to deal with people who have hyphenated last names. – imfm May 08 '15 at 17:25
  • LOL. A moderator deleted my praise for this site and its extremely helpful users from my question. I appreciate that you want us to get the point, but lets leave some room for praise and thanks too! – imfm May 08 '15 at 17:27

1 Answers1

2

For names, I recommend keeping it simple. Have a name table with three columns:

  • GivenNames
  • FamilyNames
  • NameType {legal, alternate}

You can even just go with a single name column for FullName instead. I would not have middle name, married name, maiden name, paternal surname, nickname, or any other "special" name columns. It will only complicate any name searching algorithm, and confuse data entry. Here are some example names to consider:

  • John Paul Smith
  • Mary Paul Smith
  • John Henry William Artemis Williams
  • Maria de los Angeles Gomez Portillo
  • Abdul Rahmin ibn Saeed ibn Abd al-Aziz al-Filasteeni
  • Abdulla

So what do these names show? The first is pretty normal english name, with first (John), middle (Paul) and last (Smith). The second is Mary. She has no middle name in her legal name. Her maiden surname is Paul, and her married surname is Smith. She uses both, without a hyphen. The third is John. His parents thought it would be awesome to give him three middle names. The fourth is a Hispanic name, and that's her full legal name. Her given name is Maria de los Angeles. "de los Angeles" is part of her fist name. She does not have a middle name. Her father's first surname was Gomez. Her mother's first surname was Portillo. So Maria's full surname is Gomez Portillo. She may commonly go by just Maria Gomez because she's so tired of her name getting entered in a mangled way. Technically, Portillo is her "last" name, but if she were to use only one of her two surnames, she would use the first one (the paternal surname). The fifth is Abdul Rahmin, son of Saeed, grandson of Abd al-Aziz, of Palestine. Have fun putting this in first/middle/last. Abdul Rahmin is his given name. All the rest are family names. The final one is Abdulla. He's from Afghanistan. It's not uncommon for folks to have no last name there. He is simply Abdulla. He also doesn't know his birthdate, because where he was from, they don't care about that (this was often the case in American colonial times...caring about birthdates is kind of a recent thing in many cultures).

You should have a separate name table, so a person can have more than one name row. An example illustrates one of the many, many reasons why. Mary Smith hates her common name. So she gets it legally changed to Sunshine Lollipop Countess of the Universe. She sort of regrets that two years later. Now, she sometimes uses Mary Smith, and sometimes Sunshine Universe, depending on the phase of the moon. Store both of them and you find her either way.

However you do it, people will mangle names in every way imaginable (and some you wouldn't imagine), and enter them differently at different times. If you don't accept that as a given, you'll have problems.

For households, as I mentioned in my comments, you may want to support a many-to-many relationship. A household can have zero to many people, and a person can have zero to many households. This is a little tricky though, because it depends on what your define as a household. You may feel the need to denote one as primary, but consider a child that spends half her time with her mother, and half with her father. Which one is primary?

  • Very thoughtful and funny response. Thanks! – imfm May 08 '15 at 18:32
  • How would you suggest handling Head of Household? Assume each household can have zero or one of them, and a person can only be a HoH in one household. Since people move often, this would need business logic to handle the reassignments. What would be a model that would require the least business logic on changes? – shanemgrey Jul 21 '17 at 20:49