0

Currently, I am building fact table base my ODS tables and dimension tables. My ODS Table like

1.dateWiseData

sid_date

total_impressions

2.devices(Mobile PHONE,PC,etc)

sid_date

device_id

device_name

total_impressions

3.AdvertiserData

sid_date

advertiser_id

advertiser_name

total_impressions

Fact Table

id

sid_date

devicesID

advertiserID

total_impressions

Here, We facing issue to join all three table data in single fact table. In this case we have three different total impression on each table. But in fact table we just add only single total impression. How we can calculate it and join then. We tried different join technicians but didn't find perfect solution.

Please help us handle this case

Gaurav Ashara
  • 462
  • 2
  • 6
  • 16
  • You need to define "single total impression". What do you mean? This then dictates how your data is to be transformed. You should post some of your attempts so far and clearly explain why they do not meet your requirements. Your explanation so far is unclear - you have explained neither the business requirements or the technical attempts. – Nick.Mc Feb 11 '16 at 01:38

1 Answers1

0

Are you trying to build a dimensional (Kimball) data warehouse? If so, there are some changes you can make.

(1) You don't need dateWiseData.

(2) You don't need total_impressions on devices.

(3) You don't need total_impressions on advertiserData.

Might I suggest that your model should look like more like this:

calendar (id, cal_date, cal_year, cal_month, ... )
device (id, name)
advertiser (id, name)
impression (calendar_id, device_id, advertiser_id, impression_count)

This assumes that a fact can have more than one impression. If this is not correct, then you have what is known as a "factless fact", and the table should look like this:

impression (calendar_id, device_id, advertiser_id)

Now you can run queries that, for example, calculate the total impressions by device on a given date:

select device.name,count(*)
from   impression
       inner join device on device.id = device_id
       inner join calendar on calendar.id = calendar_id
where  calendar.cal_date = '2016-02-08'
group  by device.name;

Does that meet your requirement?

EDIT: Respond to question, below.

In that case you want the first form of the fact table:

impression (calendar_id, device_id, advertiser_id, impression_count)

and your query to find impressions by advertiser looks like this:

select advertiser.name,sum(impression_count)
from   impression
       inner join advertiser on advertiser.id = advertiser_id
       inner join calendar on calendar.id = calendar_id
where  calendar.cal_date = '2016-02-08'
group  by advertiser.name;

If you attempt to carry totals anywhere else you will end up in a world of pain.

Ron Dunn
  • 2,971
  • 20
  • 27
  • Hello @Ron, Thank You for reply Actually we have different impression base on advertiser,devices.We didn't got with count. In fact table we have following columns Fact Table id sid_date devicesID advertiserID total_impressions. So total impression will be very complex to mange with fact table i want advertiser wise impression (Group By advertiser_ID) or If i want devices total impression its also produce (Group By Devices ID ). – Gaurav Ashara Feb 10 '16 at 14:22
  • I've updated the answer to reflect your question. This is not "complex", it is standard dimensional modelling. – Ron Dunn Feb 11 '16 at 01:30