0

Currently we have 12 different databases and 7 of them are dimensional. We are a non-profit knowledge based org where we have databases based on kind of disease the person has.

eg. our databases look like

  1. HIV
  2. Hepatitis C
  3. Meningitis

and so on...

Each of these would have data with tables such as:

Patient

Sample( blood samples)

location

diagnosis

Gender

Provider

We dont keep track on how much money was spent as we just keep track of +ve and -ve samples.

Now, question has come into upper management that we should build a Datawarehouse from the silo Data marts.

But, business users have never asked a question where they would need data from two different databases. Do we still need DW if users have not even thought about it?

Some more questions which came in my mind were:

  1. What kind of granularity for each of those datamarts?
  2. Which dimension could act as Conformed Dimension?
  3. How would the ETL flow?
  4. Achieve the single version of truth across all the DM's?

I am just doing an initiative to understand what could be a solution to the situation we are in. Any help is appreciated.

Thanks

Jerry
  • 127
  • 1
  • 9
  • I think you might want to take a step or two back and ask, "what does the business need from the data that it can't get today, or is hard to get?" Once you have an understanding of that, you can begin to consider an implementation. A DW might be the right choice but that's not certain. – user1443098 Apr 24 '19 at 17:40
  • I asked my manager and she has no clue. Asked colleagues if users ever asked for something which we never could answer and ever needed something like " if this person has Meningitis, did he have any respiratory illness too ?" , however, based on datamarts above and considering we use sample counts as fact, which dimension could act as conformed dimension to gel every DM together.( Location, Age, Date, Provider?) – Jerry Apr 24 '19 at 17:46
  • A natural might be a person dimension. the fact tables could refer to it. Probably too simple but maybe enough to get you started. – user1443098 Apr 24 '19 at 17:56
  • Hmm. That would mean each patient dimension right now turning into just one dimension with all the patient info. So ETL would change for compliance as well. I will look into that. Thanks – Jerry Apr 24 '19 at 18:15

2 Answers2

1

The main reason to keep all your datamarts in the same location, a data warehouse, is to be able to track the same dimensions across different datamarts.

In your example I see at least the patient, provider and disease/diagnosis dimensions that could be fed by the different data sets and keep a single version of each element.

Your data integration routines will need to be updated to ensure proper updates across all dimensions. Plus you will need to set up the data warehouse (which, if your data is small, a single node postgres server should be more than enough). If those costs are acceptable given the convenience of having consistent data across all data marts and being able to cross query, then go for it.

But, as you say, you don't see the business case anywhere. So, aren't you trying to fix something that isn't broken? Maybe leave it as is until the need arises and then evaluate the cost/benefit ratio of such a move.

nsousa
  • 4,448
  • 1
  • 10
  • 15
  • Thank you for your response. I understand that ETL routines have to updated based on the model we build for DW. And we are a microsoft shop and we are running 2012 SQL server. And I would like to leave it as is and I have asked the same question to seniors about the business use of the DW and what if we leave it as it is. They dont have answers but want to think about implementing one. I took an initiative to see what could be done before we jump to making one without any knowledge of the use of the DW. – Jerry Apr 25 '19 at 12:39
0

One reason I can think of for building a Data Warehouse here is if you want to "archive" old data that is no longer going to be needed on a regular basis in the Data Marts.

The other reason, which is already mentioned in comments, is if there's going to be a need for Enterprise-wide reporting (maybe in the case of auditing by an external party). You don't mention how big your Enterprise is, but I get the impression it's not huge, and so this probably isn't going to be something I would treat as a driving factor in your decision.

Tab Alleman
  • 31,483
  • 7
  • 36
  • 52