6

I need to save basic member's data with additional attributes such as gender, education, profession, marital_status, height, residency_status etc.

I have around 15-18 lookup tables all having (id, name, value), all attributes have string values.

Shall I create member's table tbl_members and separate 15-18 lookup tables for each of the above attributes:

tbl_members:

mem_Id
mem_email
mem_password
Gender_Id
education_Id
profession_id
marital_status_Id
height_Id
residency_status_Id

or shall I create only one lookup table tbl_Attributes and tbl_Attribute_Types?

tbl_Attributes:

att_Id
att_Value
att_Type_Id

Example data:

001 - Male - 001
002 - Female - 001
003 - Graduate - 002
004 - Masters - 002
005 - Engineer - 003
006 - Designer - 003

tbl_Attribute_Types:

att_type_Id
att_type_name

Example data:

001 - Gender
002 - Education
003 - Profession

To fill look-up drop-downs I can select something like:

SELECT A.att_id, A.att_value, AT.att_type_name
FROM tbl_Attributes A
INNER JOIN tbl_Attribute_Types AT ON AT.att_type_Id = A.att_type_Id
WHERE att_Type_Id = @att_Type_Id

and an additional table tbl_mem_att_value to save member's attributes and values

tbl_mem_att_value:

mem_id
att_id

Example data for member_id 001, is Male, Masters, Engineer

001 - 001
001 - 004
001 - 005

So my question is shall I go for one lookup table or many lookup tables?

Thanks

marc_s
  • 732,580
  • 175
  • 1,330
  • 1,459
Geo Concepts
  • 177
  • 3
  • 13
  • 1
    height_id? Are you really going to enumerate every possible height value in a lookup table? – Joe Stefanelli Dec 19 '14 at 21:04
  • One lookup table is better for people to understand the database design. Usually, the lookup table will fit in a database page or two, so most selects are from memory rather than disk. – Gilbert Le Blanc Dec 19 '14 at 21:06
  • yes, as there won't be too many heigh_ids - what do you suggest? – Geo Concepts Dec 19 '14 at 21:07
  • what would a select query to find members Male, Masters, Engineer? – Geo Concepts Dec 19 '14 at 21:09
  • I think the issue with height id is that the measurement, be it in inches or centimeters, could be trivially, readably and efficiently represented as an integer. As for one look up table or many - it is largely a matter of preference. Do you prefer to maintain one table or many? Is the number of enumerated attributes ever going to get to a point where it could really effect query time? Are all of the values comfortably represented as the same data type, IE, strings? Are you going to need to enforce correct input into these fields, to prevent a user from saying that their gender is Masters? – Politank-Z Dec 19 '14 at 21:19
  • Politank-Z there will be set numbers of heigh options available to choose from a drop-down-list, user won't be able to type / keyin - he/she will has to select an option from drop-down-lists to enforce correct inputs, say to prevent a user from saying that their gender is Masters - – Geo Concepts Dec 19 '14 at 21:31
  • can anyone suggest a select query to get member's details including all of the attribute values? – Geo Concepts Dec 19 '14 at 21:33
  • Since this is clearly a work in progress I would recommend dropping the tbl_ prefix. It does nothing except make things harder to work with. – Sean Lange Dec 19 '14 at 21:36
  • 1
    This is an anti-pattern and it even has a name: "one true lookup table". – Branko Dimitrijevic Dec 19 '14 at 23:23
  • I have around 15-18 lookup tables all having (id, name, value), all having string values. – Geo Concepts Dec 19 '14 at 23:36
  • 1
    @BrankoDimitrijevic is correct; this is a well known anti-pattern in database design, also known as MUCK (Massively Unified Code-Key) tables. https://www.red-gate.com/simple-talk/sql/database-administration/five-simple-database-design-errors-you-should-avoid/ – David Keaveny Dec 17 '19 at 23:16
  • I work with a database that is rather designed around one of these MUCK tables; it has everything in there, and then uses views to represent the "virtual" lookup tables, and of course to cast the values to the correct data type, since in the MUCK table the lookup values are all stored as VARCHAR(100). – David Keaveny Dec 17 '19 at 23:18
  • This question has been discussed [here](https://stackoverflow.com/questions/9380765/one-true-lookup-table-advantages-vs-disadvantages). – Walter Mitty Aug 07 '20 at 18:47

4 Answers4

11

Never use one lookup table for everything. It will make it more difficult to find things, and it will need to be joined in every query probably multiple times which will mean that it may cause locking and blocking problems. Further in one table you can't use good design to make sure the data type for the descriptor is correct. For instance suppose you wanted a lookup of the state abbreviations which are two characters. If you use a onesize fits all table, then it has to be wide enough for teh largest possible value of any lookup and you lose the possibility of it rejecting an incorrect entry because it is too long. This is a guarantee of later data integrity issues.
Further you can't properly use foreign keys to make sure data entry is limited only to the correct values. This will also cause data integrity issues.

There is NO BENEFIT whatsoever to using one table except a few minutes of dev time (possibly the least important concern in designing a database). There are plenty of negatives.

HLGEM
  • 94,695
  • 15
  • 113
  • 186
  • Hi HLGEM, all attribute tables would have same columns id INT, name VARCHAR(100) and value VARCHAR(100). Users will only be allowed to select a value from a drop-down-lists to enforce correct inputs, so in that case would you recommend using ONE LOOKUP TABLE or many lookup tables? – Geo Concepts Dec 19 '14 at 21:39
  • Don't you need to at some point pull those values back out so you determine what they actually mean? That is when you start joining to the same table over and over in a single query. – Sean Lange Dec 19 '14 at 21:47
  • Let put this as strongly as I can, there is no circumstance in which I would use one lookup table. It is a horrible idea. There is no advantager to it and lots of disadvantages. – HLGEM Dec 19 '14 at 23:10
  • I have around 15-18 lookup tables all having (id, name, value), all having string values. – Geo Concepts Dec 19 '14 at 23:36
  • See this discussion. https://stackoverflow.com/questions/9380765/one-true-lookup-table-advantages-vs-disadvantages – Walter Mitty Aug 07 '20 at 18:48
4

The primary reason for using multiple lookup tables is that you can then enforce foreign key constraints. This is quite important for maintaining relational integrity.

The primary reason for using a single lookup table is so you have all the string values in one place. This can be very useful for internationalization of the software.

In general, I would go with separate reference tables, because relational integrity is generally a more important concern than internationalization.

There are secondary considerations. Many different reference tables are going to occupy more space than a single reference table -- with most of the pages being empty (how much space do you really need to store the gender lookup information?). However, with a relatively small number of reference tables, this is actually a pretty minor concern.

Another consideration in using a single table is that all the reference keys will have different values. This is useful because it can prevent unlikely joins. However, I prevent this problem by naming join keys the same, both for the primary key and the foreign key. So, GenderId would be the primary key in Gender as well as the foreign key column.

Gordon Linoff
  • 1,242,037
  • 58
  • 646
  • 786
  • what would be a select query to get member's details including all of its attribute names and values in both cases? case-1 MANY lookup tables and case-2 ONE lookup table – Geo Concepts Dec 19 '14 at 21:52
  • 2
    @GeoConcepts Whether you'll search multiple lookup tables once each, or one lookup table multiple times doesn't make much of a difference. – Branko Dimitrijevic Dec 19 '14 at 23:21
  • @GeoConcepts . . . Branko is totally correct. The SQL queries would look very, very similar except in one case, the reference table names would all be the same and in another case they would all be different. – Gordon Linoff Dec 20 '14 at 11:49
2

I've struggled with the same question myself. If the only thing in the lookup table is some sort of code or id and a text value, then it certainly works to just add "attribute id" and throw it all in one table. The obvious advantage is that you then have only one table to create and manage. Searches might possibly be slower because there are more records to search, but presumably you create an index on attribute id + value id. At that point, whether performance is better having one big table or ten small tables probably depends on all sorts of details about how the database engine works and the pattern of access. That's a case where I'd say, Unless in practice it proves to be a problem, don't worry about it.

Two caveats:

One: If you do create a single table, I'd create a code for the attribute name, and then another table to list the codes. Like:

lookup_attribute(attribute_id, attribute_name)
lookup_value(attribute_id, value_id, value_text)

Then the first table has records like

1, 'Gender'
2, 'Marital Status'
3, 'Education'
etc

And the second is

1, 1, 'Male'
1, 2, 'Female'
1, 3, 'Undecided'
2, 1, 'Single'
2, 2, 'Married'
2, 3, 'Divorced'
2, 4, 'Widowed'
3, 1, 'High School'
3, 2, 'Associates'
3, 3, 'Bachelors'
3, 4, 'Masters'
3, 5, 'Doctorate'
3, 6, 'Other'
etc.

(The value id could be unique for all attribute ids or it might only be unique within the attribute id, whatever works for you. It shouldn't matter.)

Two: If there is other data you need to store for some attribute besides just the text of a value, then break that out into a separate table. Like if you had an attribute for, say, "Membership Level", and then the user says that there are different dues for each level and you need to record this, then you have an extra field that applies only to this one attribute. At that point it should become its own table. I've seen systems where they have a couple of pieces of extra data for each of several attributes, and they create a field called "extra data" or some such, and for "membership level" it holds annual dues and for "store name" it holds the city where the store is and for "item number" it holds the number of units on hand of that item, etc, and the system quickly becomes a nightmare to manage.

Update

To retrieve values, let's suppose we have just gender and marital status as lookups. The principle is the same for any others.

So we have the monster lookup table as described above. Then we have the member table with, say

member (member_id, name, member_number, whatever, gender_id, marital_status_id)

To retrieve you just write

select m.member_id, m.name, m.member_number, m.whatever,
  g.value_text as gender, ms.value_text as marital_status
from member m
join lookup_value g on g.attribute_id=1 and g.attribute_value=m.gender_id
join lookup_value ms on ms.attribute_id=2 and ms.attribute_value=m.marital_status_id
where m.member_id=@member_id

You could, alternatively, have:

member (member_id, name, member_number, whatever)
member_attributes (member_id, attribute_id, value_id)

Then you can get all the attributes w

select a.attribute_name, v.value_text
from member_attribute ma
join lookup_attribute a on a.attribute_id=ma.attribute_id
join lookup_value v on v.attribute_id=a.attribute_id and v.value_id=ma.value_id
where ma.member_id=@member_id

It occurs to me as I try to write the queries that there's a distinct advantage to making the value id globally unique: Not only does that eliminate having to specify the attribute id in the join, but it also means that if you do have a field for, say, gender_id, you can still have a foreign key clause on it.

Jay
  • 26,876
  • 10
  • 61
  • 112
  • what would be a select query to get member's details including all of its attribute names and values? – Geo Concepts Dec 19 '14 at 21:46
  • and to get list of all members including all of its attribute names and values? – Geo Concepts Dec 19 '14 at 23:21
  • Just instead of "where member_id=@id" put whatever selection criteria you want. Of course you'd then need to include the member id or something to identify the member in the SELECT. To get all members, put no WHERE clause at all. If you have a separate member_attribute table and you don't know at coding time what the attributes are, then you'll have to do a query like the above that generates one output row per attribute, and your code would have to examine the member id as it processed records. (There's no way in ANSI SQL to combine multiple unknown rows like that into a single row, ... – Jay Dec 29 '14 at 14:59
  • ... Some SQL dialects provide features to do this, some of which work smoothly and some don't.) – Jay Dec 29 '14 at 14:59
0

Putting all the lookup values into a single table is usually referred to as Common Lookup Tables, or Massively Unified Code-Key (MUCK), and is generally considered a design error.

Great argumentation of why it's not a good idea can be found in the article below.

https://www.red-gate.com/simple-talk/sql/database-administration/five-simple-database-design-errors-you-should-avoid/

FFFffff
  • 776
  • 7
  • 18