5

Short introduction – logic behind the data

I am working on a PostgreSQL database that stores experimental data (Raman spectra) and their metadata. It is necessary, in addition to the "normal" spectra of our samples, to acquire and save special spectra for the instrument calibration purposes.

Each time I do an experiment I create a new record in the table measurements, where I specify the measurement type and the corresponding metadata (foreign keys to other tables).

The issue is, that the metadata type differs depending on the measurement type:

  1. Sample measurement – we have to store sample_id from the table samples
  2. Standard substances – in this case we specify a substance from the table substances
  3. Special standardized light source – source_id has to be provided
  4. Dark frames – simple, no data from other tables are necessary

samples, substances and light sources have completely different properties, so we have to keep them in separate tables, linked to the table measurements via foreign keys. I think my data have a polymorphic relation here (correct me if I'm wrong).

The problem

In the current design, the value stored in column measurementtype_id (highlighted red) defines which foreign keys (highlighted yellow) have to be NULLed and which have to contain a value.

Table relationships

enter image description here

I am pretty sure that this design violates the third normal form. We can have a scenario when metadata of a wrong type is associated with the measurement. How can I ensure the data integrity? Unfortunately, I was not able to find a reasonable solution so far...

philipxy
  • 14,867
  • 6
  • 39
  • 83
R Kiselev
  • 1,124
  • 1
  • 9
  • 18
  • A NF is something that a table in isolation satisfies or doesn't. And normalization doesn't deal with NULLs. So it's not clear what your concerns are re 3NF. PS Search/google re SQL subtyping/polymorphism. – philipxy Mar 10 '16 at 02:22
  • What is your DBMS? The constraints that you can write declaratively depend on that. – philipxy Mar 10 '16 at 02:30
  • Thank you very much @philipxy! Your search keywords helped me to find the relevant info, and I think that I understood the topic. After some research and testing I came to a solution similar to [this one](http://stackoverflow.com/a/562030/6029703). It seems to work well and does what I want. I also tested PostgreSQL inheritance, but it did not work for me due to [issues with foreign keys](http://stackoverflow.com/a/26035311/6029703). – R Kiselev Mar 10 '16 at 10:07

1 Answers1

1

With PostgreSQL you can have check constraints. Now, I think your big problem is that maintenance over time is likely to be a problem if you have to add more types.

However, you can fairly easily do something like:

ALTER TABLE measurements ADD
CHECK((measurementtypeid <> 1 
          OR (... IS NOT NULL ...)) -- CONSTRAINTS FOR TYPE 1
       AND (measurementtypeid <> 2
          OR (....) -- constraints for type 2
       -- etc
 );

Again this solves your immediate issue but becomes a maintenance problem later

A better approach would be to break off the related columns into two separate join tables as this would avoid this sort of headache.

Chris Travers
  • 25,424
  • 6
  • 65
  • 182
  • 1
    Thangs, this is a good suggestion. However, I already re-designed the database and split data from `measurements` table across four tables, one for each measurement type and a master table that contains only common metadata. – R Kiselev Mar 12 '16 at 13:07