I am using postgreSQL as my database. I have a table MASTER(A, B, C, D, N1, N2, N3, N4, N5, N6) where the primary key is (A, B, C, D) and N1, N2, N3, N4, N5, N6 are the numeric columns.
I have a query as below to get the summarized data of each A selected from each list in MASTERCOMB.
SELECT MASTERCOM.A
,STATS.sumn1
,STATS.sumn2
,STATS.sumn3
,STATS.sumn4
,STATS.sumn5
,STATS.sumn6
FROM (WITH
sum1 AS (SELECT A, SUM(N1) FROM MASTER WHERE B = $1 GROUP BY A ORDER BY SUM(N1) DESC LIMIT $2),
sum2 AS (SELECT A, SUM(N2) FROM MASTER WHERE B = $1 GROUP BY A ORDER BY SUM(N2) DESC LIMIT $2),
sum3 AS (SELECT A, SUM(N3) FROM MASTER WHERE B = $1 GROUP BY A ORDER BY SUM(N3) DESC LIMIT $2),
sum4 AS (SELECT A, SUM(N4) FROM MASTER WHERE B = $1 GROUP BY A ORDER BY SUM(N4) DESC LIMIT $2),
sum5 AS (SELECT A, SUM(N5) FROM MASTER WHERE B = $1 GROUP BY A ORDER BY SUM(N5) DESC LIMIT $2),
sum6 AS (SELECT A, SUM(N6) FROM MASTER WHERE B = $1 GROUP BY A ORDER BY SUM(N6) DESC LIMIT $2)
SELECT DISTINCT COALESCE(sum1.A, sum2.A, sum3.A, sum4.A, sum5.A, sum6.A) A
FROM sum1
FULL OUTER JOIN sum2 ON sum2.A = sum1.A
FULL OUTER JOIN sum3 ON sum3.A = sum1.A
FULL OUTER JOIN sum4 ON sum4.A = sum1.A
FULL OUTER JOIN sum5 ON sum5.A = sum1.A
FULL OUTER JOIN sum6 ON sum6.A = sum1.A) MASTERCOMB
LEFT JOIN (SELECT A
,SUM(N1) sumn1
,SUM(N2) sumn2
,SUM(N3) sumn3
,SUM(N4) sumn4
,SUM(N5) sumn5
,SUM(N6 sumn6)
FROM MASTER WHERE B = $1 GROUP BY A) AS STATS
ON STATS.A = MASTERCOMB.A
This is just one kind of query with B in the WHERE clause. I may have to query with different combinations like 'WHERE C = $3' OR 'WHERE D = $4'. In rare cases I may have to query with combinations of multiple conditions on B, C and D together;
As the table grows, the performance of the queries could drop. So I am thinking of two aproaches
Approach #1:
- Create Summary Tables SMRY_A_B, SMRY_A_C, SMRY_A_D
- On each insert, update and delete of MASTER table, SUM the values and insert/update/delete respective tables
Approach #2:
- Create a Summary table SMRY_A_B_C_D with primary key (A, B, C, D)
- On each insert, update and delete of MASTER table, SUM the values and insert/update/delete SMRY_A_B_C_D table
possible values for SMRY_A_B_C_D could be
(valA, valB, 'N/A', 'N/A', sumn1, sumn2, sumn3, sumn4, sumn5, sumn6)
(valA, 'N/A, valC, 'N/A', sumn1, sumn2, sumn3, sumn4, sumn5, sumn6)
(valA, 'N/A, 'N/A', 'valD', sumn1, sumn2, sumn3, sumn4, sumn5, sumn6)
Questions:
- Which approach is better to go with?
- Should I not consider both the approaches and query from the master table itself? If so should I optimize the query?