3

I need some suggestions from the database design experts here.

  1. I have around six foreign keys into a single table (defect) which all point to primary key in user table. It is like:

    defect (.....,assigned_to,created_by,updated_by,closed_by...)
    

    If I want to get information about the defect I can make six joins. Do we have any better way to do it?

  2. Another one is I have a states table which can store one of the user-defined set of values. I have defect table and task table and I want both of these tables to share the common state table (New, In Progress etc.). So I created:

    • task (.....,state_id,imp_id,.....)
    • defect(.....,state_id,imp_id,...)
    • state(state_id,state_name,...)
    • importance(imp_id,imp_name,...)

There are many such common attributes along with state like importance(normal, urgent etc), priority etc. And for all of them I want to use same table. I am keeping one flag in each of the tables to differentiate task and defect. What is the best solution in such a case?

If somebody is using this application in health domain, they would like to assign different types, states, importances for their defect or tasks. Moreover when a user selects any project I want to display all the types,states etc under configuration parameters section.

akhil_mittal
  • 23,309
  • 7
  • 96
  • 95
  • What's the role of `type` and `type_id` in all this? Is the type a property of the state or is it a property of the state "instance" in the task or defect? – Branko Dimitrijevic Jun 28 '12 at 09:33
  • Type basically refers to importance task can have (normal, urgent, critical) etc and type_id is primary key for that table. Now a task and a defect both will have type (or better call it importance) ans state (new, in progress etc) – akhil_mittal Jun 29 '12 at 03:07

2 Answers2

6

1 ...

There is nothing inherently wrong with this. If the possible user "roles" are strictly determined and unlikely to change, then this is, in fact, the preferred way to do it. You are effectively modeling the C:M relationship, where C is constant.

On the other hand, if the roles can change (e.g. you need to be able to dynamically add a new "lifecycle phase" to the defect), then a link (aka. junction) table, modelling the true M:N relationship, might be justified. Something like this:

enter image description here

BTW, while JOINs have their cost, that by itself doesn't mean you can't afford it. And you may not even need all the JOINs all the time - just do the JOINs that bring the information that you currently need and leave the others out. In any case, I recommend you measure on realistic amounts of data to determine if there is an actual performance problem.

2 ...

If there are many common fields, you can use inheritance1 to minimize the repetition of common fields and foreign keys. For example:

enter image description here

In this model, each "attribute" table (such as state and importance) is connected only once with the base, and through it to each of the "inherited" tables such as defect and task. No matter how many "inherited" tables you add, you'll still have just one connection per "attribute" table

This model essentially prevents the proliferation of "attribute" FKs, for the price of somewhat more cumbersome management of defects and tasks (since they are now split to "common" and "specific" portion). Whether this is a better balance than the old design is for you to decide...


1 Aka. category, subclass, generalization hierarchy etc...

Branko Dimitrijevic
  • 50,809
  • 10
  • 93
  • 167
  • 1
    Thanks Fish. Branko you are awesome :) I am using exactly same table structure, so I am quite confident now that I am going in right direction. This is my first attempt of creating end-to-end application, so was little confused about table designs. – akhil_mittal Jun 30 '12 at 04:40
1

For the first issue it seems to me like you might be better off decomposing the defects table to have a "line" table, that stores the history of the transactions on it, like being created, updated, assigned etc.

This has the benefit of being more flexible. A defect can more easily track the history of being updated and assigned to multiple people, being reopened etc., with multiple updates through time being clearly denoted.

You'd still have your defect table storing some general global state for the defect (perhaps things like status, priority etc., especially if you don't need to track the changes in these states), and you'll have a series of *defect_lines*, each with a foreign key pointing to the defect, and another foreign key pointing to the user, and a type field indicating whether it was "create", "update" etc.

Eg.

defect ( id, priority, status )
defect_line ( defect_id, timestamp, type, user, comment )

Depending on the assumptions you're fine with making, defect_id, timestamp, and user could be the primary key for a defect_line.


For the second issue, at least with the information you've given it seems that they share all the same properties - so are basically the same thing with a different type flag ("task" or "defect"), so why not just have them in the same table?

I'd probably call this task under the assumption that all defects can be viewed as tasks ("things to be fixed"), but all tasks are not defects - but this is all just semantics now.

If there are distinct, non-shared properties though, you might then get to the case that you could have the shared properties of a task/defect in a table, and tables for the properties specific to each, though they might be able to still share the same "line" detail structure. This then partly becomes a decision between normalisation and ease of use - it might be easier to have two tables that largely have the same properties instead of braking this down into multiple tables.

Fish
  • 459
  • 2
  • 6