5

I have a very simple question concerning 2 alternatives from which I do not know which one to choose. I have entities which can be "contacts". A 'contact' can have multiple e-mail adresses, multiple phone numbers and multiple addresses. In my data model I make the entity contact, which has a 1 to n realtion to entities email, phone and address. Now the table email has just the fields "email" and "comments" and the phone has the similar structure "phoneNumber" and "comments".

enter image description here

Is it 'better' to keep those in 2 different tables or should I make one table, lets name it 'detail' or whatever, with columns 'value', 'type' and 'comments' with type being e.g. 'email' or 'phoneNumber'.

enter image description here

I am using L4 and with the Eloquent Models I expect it to be easily possible to write some methods that give me the same functionality as with different tables. But I feel like having strings to separate between different types of information is wrong. Feels like it is easier to make mistakes then. With eager loading I expect not to have significantly more queries even if I have 2 tables. FYI the number of rows in phone/email will be definitely below 10.000.

Does it make a difference what type I choose? What would you do? And why?

Thanks for helping,

Kind regards

Matthias S
  • 3,358
  • 3
  • 21
  • 31
  • You have entered into something of a religious war. The way to understand the problem is how the data will be used. Are you always looking for a "contact" when linking to other tables? Or, do you something want to link to specific types of contacts? – Gordon Linoff Jan 15 '14 at 13:10
  • *Favorited for interest.* – ʰᵈˑ Jan 15 '14 at 13:11
  • "Better" is a subjective term. Generally you're trading a level of normalization for 'ease of use'. Most people have a business address, a home address, and three contact numbers - home, work (direct dial), and mobile. It's 'normal' (as opposed to 'normalized') to store all these in one table. – Strawberry Jan 15 '14 at 13:11
  • The problem is that I have to provide the possibility to store 'endless' phone numbers. Therefore I can not make assumptions about how many phone numbers there are and put it inside the address table. In this specific case I have 'business' and 'person', for both I need to be able to store different, labeled addresses, different phone numbers and email adresses. – Matthias S Jan 15 '14 at 13:16
  • "Most people have three phone numbers." Maybe so, but if you put three phone numbers in one record, then what do you do when someone comes along with four phone numbers? I had a time when I had two cell phones, my work cell phone and my personal cell phone. Why create a potential headache when there's an easy solution? I'm reminded of an insurance system I worked on where rather than have a separate record for each family member, they had one big record with husband, wife, and slots for six children. I was called in when they got a customer who had eight children. Did no one see that coming? – Jay Jan 15 '14 at 17:00

2 Answers2

3

It depends on just what you plan to do with these fields. My inclination in the absence of further information would be to put them in one table.

You will have some purists who say that has an email address is not the same thing as a phone number, therefore it is an offense against nature to use the same field to hold both.

I agree with that type of thinking when what you do with the two fields is different. Like if someone said that sometimes this field contains the phone number and other times it contains the transaction amount, and when we are calculating total balance we add up all the ones that are transaction amounts but ignore the ones that are phone numbers, I would cry.

But that kind of reasoning can be taken to ridiculous extremes. The question is really not whether two things are different in the real world out there, but whether they are different for purposes of our system. Like I can't imagine saying that because sometimes an address is a post office box and sometimes it's a street address and sometimes it's an apartment number, etc, that we must have separate fields for each of these things rather than just "address line 1" and "address line 2". Or that we must have separate fields for the "brown-haired person's name versus "blonde-haired person's name", because hey, they look different. I have had many frustrating conversations with users where I try to explain that as far as the system is concerned, everything is a "product", and the user says no, no, how can you possibly say that furniture is the same as stationery? But if what we do in the system is record the name, quantity on hand, and price, than I don't care about the difference. Etc etc.

Indeed, some of my Proudest Moments In Software Development were when I realized that two things that are different in the real world are really the same to the system and can be handled with one table or one block of code instead of many. Like somewhere along the line I figured out that employees, vendors, and customers all have names, addresses, phone numbers, and emails. So now rather than having all these fields in the employee table and all the same fields in the vendor table and all the same fields again in the customer table, I create one field that I call "person" and put all the common stuff in there, and then just link to it from the other tables. So when someone comes along and says that now we have to handle foreign addresses, I change one table instead of three and if I was smart one address-formatting function instead of 3.

In this case, what are you going to do with phone numbers and email addresses? Probably mostly let users type them in and then later display them. I can readily envision a system where you would never even check which is which. At data entry time there's a dropdown for the "type of contact information", and at display time you display the type of contact along with the contact value, probably sorted by type of contact. If you are sending automated emails, maybe you select where type='email'.

Now if you are doing joins from this table to another table using the email address as the join field, that would be different, because then half your data doesn't make sense.

BTW if you do use one table, you need a code to say what type of contact this is. I think you realize that. I suggest you consider creating a look-up table that contains the codes and their definitions, like create table contact_type (contact_type_code char(2) primary key, contact_type_description varchar(40)), rather than hard-coding the contact types in the program. Or worse, putting the description of the contact type in each record, so sometimes it says "e-mail" and other times "email" and other times "E-mail" and probably other times "e-male" or "internet".

Sorry for the long rambling answer.

Jay
  • 26,876
  • 10
  • 61
  • 112
  • Thanks for this advice. My application is rather small but the requirements are vaguely explained and seem to change at the customers will. I think that a single table is easier to maintain – Matthias S Jan 22 '14 at 13:43
0

This is how I typically design tables for addresses, contacts and e-mails:

Example address types might be "Home", "Mailing", "Work", etc.

address_types
    id                  varchar(15)(P)

The reason city_id and county_id can both be NULL is because in Virginia an address is either within a city or a county - but not both. So at the application level I enforce that at least one of these two fields must not be NULL.

addresses
    id                  unsigned int(P)
    address_type_id     varchar(15)(F address_types.id)
    line1               varchar(50)
    line2               varchar(50)
    city_id             unsigned int(F cities.id) Default NULL
    county_id           unsigned int(F counties.id) Default NULL
    zip                 varchar(6)
    zip4                char(4) Default NULL
    lat                 decimal(10,8) // Provides for accuracy to ~1mm. Default to NULL     lon                     decimal(11,8) // Provides for accuracy to ~1mm. Default to NULL

You may or may not want things like fips_number and the columns listed after it. They are codes that were used by the US government.

cities
    id                      unsigned int(P)
    state_id                unsigned int(F states.id)
    name                    varchar(50)
    lat                     decimal(10,8) // Provides for accuracy to ~1mm. Default to NULL
    lon                     decimal(11,8) // Provides for accuracy to ~1mm. Default to NULL
    fips_number             unsigned int // Default NULL
    census_code             unsigned int // Default NULL
    census_class_code       char(2)  // Default NULL
    gsa_code                unsigned int // Default NULL
    opm_code                unsigned int // Default NULL

There's a many-to-many relationship between cities and counties so we define those here. Each column is a foreign key to it's respective table and together they form the primary key.

cities_counties
    city_id             unsigned int \_ (P) (F cities.id)
    county_id           unsigned int /      (F counties.id)

Example contact types might be "Home", "Work", "Pager", "Cell", etc.

contact_types
    id                  varchar(15)(P)

You may or may not want to break up the phone numbers into their constituent parts. In this example they are broken up. Handy way to query for numbers in a particular country code, area code, etc.

contacts
    id                  unsigned int(P)
    contact_type_id     varchar(15)(F contact_types.id)
    country_code        char(3) // Default to NULL
    area_code           char(3)
    exchange            char(3)
    station             char(4)
    extension           varchar(10) // Default to NULL

This would hold all the counties, parishes and other similar geographical areas in US states and territories.

counties
    id                      unsigned int(P)
    state_id                unsigned int(F states.id)
    name                    varchar(50)
    fips_number             unsigned int // Default NULL

Take a look at ISO 3166-1 for more information about country codes.

countries
    id                      char(2)(P)
    iso3                    char(3)(U)
    iso_num                 char(3)(U)
    name                    varchar(44)(U)

Example contact types might be "Home", "Work", etc.

email_types
    id                  varchar(15)(P)

E-mail addresses - you may or may not want to split them up like in my example.

emails
    id              unsigned int(P)
    email_type_id   varchar(15)
    mailbox         varchar(255)
    domain          varchar(255)

Take a look at ISO 3166-2 for more information about state codes.

states
    id                      unsigned int(P)
    country_id              char(2)(F countries.id)
    code                    varchar(3)
    name                    varchar(45)
    fips_number             unsigned int // Default NULL

And then of course you tie them to an entity:

customer_addresses
    id              unsigned int(P)
    customer_id     unsigned int(F customers.id)
    address_id      unsigned int(F addresses.id)

customer_contacts
    id              unsigned int(P)
    customer_id     unsigned int(F customers.id)
    contact_id      unsigned int(F contacts.id)

customer_emails
    id              unsigned int(P)
    customer_id     unsigned int(F customers.id)
    email_id        unsigned int(F emails.id)
Benny Hill
  • 6,191
  • 4
  • 39
  • 59
  • This is way more complexity than most systems I've worked on need. Like when do I care what county a city is in? Or what the "census code" for a city is? If you're in the shipping business or doing demographic research, maybe this is all valuable, in which case, cool. But for most systems, it's a lot simpler to just have, e.g. a free-form text field for city name. Building in this kind of complexity if you don't need it just gets you into trouble when you hit cases that don't fit the mold. Like non-US phone numbers are not always broken into groups of 3, 3, and 4 digits. – Jay Jan 15 '14 at 17:11
  • @Jay - actually, this design comes from spending over 15 years in shipping. All north American phone numbers fit the pattern 999-999-9999, just like all zip codes are 5 digits (US and MX) or 6 characters (CA). The answer though, like many answers pertaining to database design, is really meant as a starting point. We have no idea what Matthias' business needs are, the best we can do is give him some tried and true advice. – Benny Hill Jan 15 '14 at 18:06
  • Note I said, "if you're in the shipping business ...". So for you this design might well make a lot of sense. But for most systems it is a lot of unnecessary complexity. And yes, all North American phone numbers are 999-999-9999, but phone numbers on other continents are not. I wasn't trying to say that your design is stupid or anything like that, just that I think you have to take it in context, and you have to understand the context. – Jay Jan 15 '14 at 19:41
  • Thanks for sharing your experience. This is way to detailed for my application, for sure. But it shows me what approach you took - different tables for different data. I think I will stick to one table though (http://martinfowler.com/eaaCatalog/singleTableInheritance.html). Finally found out that this is called Single Table Inheritance. – Matthias S Jan 22 '14 at 13:07