I have a table `daily_data` that stores per-day metrics of stores — entries, sales, hours, and so on (there can be ~100 metric types). My plan is to calculate conversion, entries per hour, sales per hour, etc. for many stores (up to ~1000).
The reporting range can be one day, a few days, or months to years.
| id_pos | date | data_type | value |
|---|---|---|---|
| 1 | 2021-11-01 | 1 | 100 |
| 1 | 2021-11-01 | 2 | 20 |
| 1 | 2021-11-01 | 3 | 8 |
| 2 | 2021-11-02 | 1 | 50 |
| 2 | 2021-11-02 | 2 | 10 |
| 2 | 2021-11-02 | 3 | 8 |
| ... | ... | ... | ... |
`data_type` column values:
- 1 = entries
- 2 = sales
- 3 = hours

*`id_pos`: ID of the store location*
My query:
```sql
SELECT id_pos, data_type, SUM(value) AS data
FROM daily_data
WHERE id_pos IN (1, 2)
  AND date >= '2021-11-01' AND date <= '2021-11-30'
GROUP BY id_pos, data_type;
```
After executing the query above, I post-process the result in application code to calculate conversion (sales/entries), sales per hour, entries per hour, etc. (the formulas have many exception cases).
| id_pos | data_type | value |
|---|---|---|
| 1 | conversion | 20% |
| 1 | entries per hour | 20.1 |
| 1 | sales per hour | 5.6 |
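The pipeline described above — aggregate per store and metric with SQL, then apply the formulas in application code — can be sketched as follows. This is a minimal illustration using an in-memory SQLite database with the sample rows from the question; the divide-by-zero guards stand in for the "many exception cases":

```python
import sqlite3

conn = sqlite3.connect(":memory:")
conn.execute("""CREATE TABLE daily_data (
    id_pos INTEGER, date TEXT, data_type INTEGER, value REAL)""")

# data_type: 1 = entries, 2 = sales, 3 = hours
rows = [
    (1, "2021-11-01", 1, 100), (1, "2021-11-01", 2, 20), (1, "2021-11-01", 3, 8),
    (2, "2021-11-02", 1, 50),  (2, "2021-11-02", 2, 10), (2, "2021-11-02", 3, 8),
]
conn.executemany("INSERT INTO daily_data VALUES (?, ?, ?, ?)", rows)

# Step 1: the aggregate query from the question.
totals = {}  # (id_pos, data_type) -> summed value
for id_pos, data_type, data in conn.execute(
        """SELECT id_pos, data_type, SUM(value) AS data
           FROM daily_data
           WHERE id_pos IN (1, 2)
             AND date >= '2021-11-01' AND date <= '2021-11-30'
           GROUP BY id_pos, data_type"""):
    totals[(id_pos, data_type)] = data

# Step 2: derive the ratios in code, guarding against missing
# metrics and division by zero.
def metrics(id_pos):
    entries = totals.get((id_pos, 1), 0)
    sales = totals.get((id_pos, 2), 0)
    hours = totals.get((id_pos, 3), 0)
    return {
        "conversion": sales / entries if entries else None,
        "entries_per_hour": entries / hours if hours else None,
        "sales_per_hour": sales / hours if hours else None,
    }

print(metrics(1))  # conversion 0.2, entries/hour 12.5, sales/hour 2.5
```

With the one-day sample data, store 1 has 100 entries, 20 sales, and 8 hours, so conversion is 20/100 = 0.2 and entries per hour is 100/8 = 12.5.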
The question:

This structure works fine with small data, but once the table grows to 10–20 million rows, query performance will suffer.

So my plan is to add a new `total_data` table that stores the computed conversion, entries per hour, and sales per hour (written after executing the query above). That way I can select the final numbers directly, without applying any formula in application code.
| id_pos | date_begin | date_end | data_type | value |
|---|---|---|---|---|
| 1 | 2021-11-01 | 2021-11-30 | conversion | 10 |
| 1 | 2021-11-01 | 2021-11-30 | entries/hour | 20.1 |
| 1 | 2021-11-01 | 2021-11-30 | sales/hour | 5.1 |
| 2 | 2021-11-01 | 2021-11-30 | conversion | 22 |
| 2 | 2021-11-01 | 2021-11-30 | entries/hour | 6.2 |
| 2 | 2021-11-01 | 2021-11-30 | sales/hour | 6.5 |
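The plan above — precompute the ratios for one date range and write them into `total_data` — can be sketched like this. The table and column names follow the question; the in-memory SQLite setup and the conditional-aggregation pivot (`SUM(CASE WHEN ...)`) are illustrative assumptions, not part of the original schema:

```python
import sqlite3

conn = sqlite3.connect(":memory:")
conn.executescript("""
CREATE TABLE daily_data (id_pos INTEGER, date TEXT, data_type INTEGER, value REAL);
CREATE TABLE total_data (
    id_pos INTEGER, date_begin TEXT, date_end TEXT, data_type TEXT, value REAL);
INSERT INTO daily_data VALUES
    (1, '2021-11-01', 1, 100), (1, '2021-11-01', 2, 20), (1, '2021-11-01', 3, 8);
""")

begin, end = "2021-11-01", "2021-11-30"

# Pivot the three base metrics into columns, then store the derived
# ratios for this exact begin/end pair.
for id_pos, entries, sales, hours in conn.execute(
        """SELECT id_pos,
                  SUM(CASE WHEN data_type = 1 THEN value ELSE 0 END) AS entries,
                  SUM(CASE WHEN data_type = 2 THEN value ELSE 0 END) AS sales,
                  SUM(CASE WHEN data_type = 3 THEN value ELSE 0 END) AS hours
           FROM daily_data
           WHERE date >= ? AND date <= ?
           GROUP BY id_pos""", (begin, end)):
    derived = [
        ("conversion", sales / entries if entries else None),
        ("entries/hour", entries / hours if hours else None),
        ("sales/hour", sales / hours if hours else None),
    ]
    conn.executemany(
        "INSERT INTO total_data VALUES (?, ?, ?, ?, ?)",
        [(id_pos, begin, end, name, v) for name, v in derived])

# Later reads are a direct lookup, with no formulas in code:
print(conn.execute(
    "SELECT data_type, value FROM total_data WHERE id_pos = 1").fetchall())
```

Note that each precomputed row only serves the one exact `date_begin`/`date_end` pair it was built for, which is precisely the difficulty with arbitrary ranges.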
But the problem is the date range: it can be 2021-11-02 to 2021-11-05, or thousands of other combinations. How can I design a proper `total_data` table?