-2

I'm a junior doctor and I'm creating a database system for my senior doctor. Basically, my senior Dr wants to be able to store a whole lot of information on each of his patients in a relational database so that later, he can very easily and quickly analyse / audit the data (i.e. based on certain demographics, find which treatments result in better outcomes or which ethnicities respond better to certain treatments etc. etc.).

The information he wants to store for each patient is huge.

Each patient is to complete 7 surveys (each only takes 1-2 minutes) a number of times (immediately before their operation, immediately postop, 3months postop, 6months postop, 2years postop and 5years postop) - the final scores of each of these survey at these various times will be stored in the database.

Additionally, he wants to store their relevant details (name, ethnicity, gender, age etc etc).

Finally, he also intends to store A LOT of relevant past medical history, current symptoms, examination findings, the various treatment options they try and then outcome measures.

Basically, there's A LOT of info for each patient. And all of this info will be unique to each patient. So because of this, I've created one HUGE patient table with (~400 columns) to contain all of this info. The reason I've done this is because most of the columns in the table will NOT be redundant for each patient.

Additionally, this entire php / mysql database system is only going to live locally on his computer, it'll never be on the internet.

The tables won't have too many patients, maybe around 200 - 300 by the end of the year.

Given all of this, is it ok to have such a huge table? Or should I be splitting it into smaller tables i.e. - Patient demographics - Survey results - Symptoms - Treatments etc. etc, with a unique "patient_id" being the link between each of these tables?

What would be the difference between the 2 approaches and which would be better? Why?

Rick James
  • 135,179
  • 13
  • 127
  • 222
  • I'd just do `patient` and `survey` tables. `survey` can have a patient_id as a foreign key and a survey type for if it was pre-op, post-op etc. You say that the columns will not be redundant but there two problems here. You're creating columns with the intention of filling them 5 years after the row is inserted, and you're not very flexible about adding more surveys at different intervals. – apokryfos Jul 23 '17 at 08:25
  • 1
    @apokryfos and medical histoy and examinations and their results. There are quite a few tables that the OP should create. – Shadow Jul 23 '17 at 08:26
  • @Shadow yes, (I must have missed that part in the first read of the question), a `past_history_entry` table would also be needed having a patient_id and the relevant information in it. Which begs the question on how this would ever be encoded as columns in a table, not all patients would have the same history entries I wouldn't think. – apokryfos Jul 23 '17 at 08:29
  • While I'm not a huge fan of the EAV model, it is the most popular method for storing patient data- precisely because there are endless variables - but if you're sure that there will only be a small number of nulls (this seems really unlikely) then it might not be so appropriate. If you do decide to switch to an EAV model, consider having separate tables for each data type. Whatever you decide, be sure to encrypt the data, and abide by your government's/professional body's requirements in respect of handling and storing patient data. – Strawberry Jul 23 '17 at 08:51

1 Answers1

0

About the 400 columns...

Which, if any, of the columns will be important to search or sort on? Probably very few of them. Keep those few columns as columns.

What will you do with the rest? Probably you simply display them somewhere, using some app code to pretty-print them? So these may as well be in a big JSON string.

This avoids the EAV nightmares, yet stores the data in the database in a format that is actually reasonably easy (and fast) to use.

Rick James
  • 135,179
  • 13
  • 127
  • 222