I would like to optimize this query:
SELECT "IU" AS Factor,
"GICS Sector/" & t1.[GICS Sector],
#10/6/2015# AS CalcDate,
(Count(t1.[IU]) / Count(*) - 1)*100 AS NaPct,
Avg(t1.[IU]) AS Mean,
StDev(t1.[IU]) AS Sd,
Min(t1.[IU]) AS Low,
(SELECT Max(tp1.[IU])
FROM tbl_DatedModel_2015_1005_0 AS tp1
WHERE tp1.[IU] IN
(SELECT top 25 percent tp2.[IU]
FROM tbl_DatedModel_2015_1005_0 AS tp2
WHERE t1.[GICS Sector] = tp2.[GICS Sector] ORDER BY tp2.[IU])) AS 25Percentile,
(SELECT Max(tp1.[IU])
FROM tbl_DatedModel_2015_1005_0 AS tp1
WHERE tp1.[IU] IN
(SELECT top 50 percent tp2.[IU]
FROM tbl_DatedModel_2015_1005_0 AS tp2
WHERE t1.[GICS Sector] = tp2.[GICS Sector] ORDER BY tp2.[IU])) AS Median,
(SELECT Max(tp1.[IU])
FROM tbl_DatedModel_2015_1005_0 AS tp1
WHERE tp1.[IU] IN
(SELECT top 75 percent tp2.[IU]
FROM tbl_DatedModel_2015_1005_0 AS tp2
WHERE t1.[GICS Sector] = tp2.[GICS Sector] ORDER BY tp2.[IU])) AS Q3,
Max(t1.[IU]) AS High,
((SELECT Max(tp1.[IU])
FROM tbl_DatedModel_2015_1005_0 AS tp1
WHERE tp1.[IU] IN
(SELECT top 75 percent tp2.[IU]
FROM tbl_DatedModel_2015_1005_0 AS tp2
WHERE t1.[GICS Sector] = tp2.[GICS Sector] ORDER BY tp2.[IU])) -
(SELECT Max(tp1.[IU])
FROM tbl_DatedModel_2015_1005_0 AS tp1
WHERE tp1.[IU] IN
(SELECT top 25 percent tp2.[IU]
FROM tbl_DatedModel_2015_1005_0 AS tp2
WHERE t1.[GICS Sector] = tp2.[GICS Sector] ORDER BY tp2.[IU]))) AS IQR,
IIf((Sum((t1.[IU]-z.TheAvg)^2)/Count(t1.[IU])) = 0 OR Count(t1.[IU]) = 0,NULL,Sum((t1.[IU]-z.TheAvg)^4)/Count(t1.[IU]))/(Sum((t1.[IU]-z.TheAvg)^2)/Count(t1.[IU]))^2 AS Kurtosis,
IIf(Count(t1.[IU]) = 0 OR (Sum((t1.[IU]-z.TheAvg)^2)/Count(t1.[IU]))^1.5 = 0,NULL,(Sum((t1.[IU]-z.TheAvg)^3)/Count(t1.[IU]))/(Sum((t1.[IU]-z.TheAvg)^2)/Count(t1.[IU]))^1.5) AS Skewness,
Count(t1.[IU]) AS Obs
FROM tbl_DatedModel_2015_1005_0 AS t1
INNER JOIN
(SELECT t2.[GICS Sector], Avg(t2.[IU]) AS TheAvg
FROM tbl_DatedModel_2015_1005_0 AS t2
GROUP BY t2.[GICS Sector]) AS z ON t1.[GICS Sector] = z.[GICS Sector]
GROUP BY t1.[GICS Sector]
HAVING Count(t1.[IU]) > 0
UNION ALL
SELECT "GM" AS Factor,
"GICS Sector/" & t1.[GICS Sector],
#10/6/2015# AS CalcDate,
(Count(t1.[GM]) / Count(*) - 1)*100 AS NaPct,
Avg(t1.[GM]) AS Mean,
StDev(t1.[GM]) AS Sd,
Min(t1.[GM]) AS Low,
(SELECT Max(tp1.[GM])
FROM tbl_DatedModel_2015_1005_0 AS tp1
WHERE tp1.[GM] IN
(SELECT top 25 percent tp2.[GM]
FROM tbl_DatedModel_2015_1005_0 AS tp2
WHERE t1.[GICS Sector] = tp2.[GICS Sector] ORDER BY tp2.[GM])) AS 25Percentile,
(SELECT Max(tp1.[GM])
FROM tbl_DatedModel_2015_1005_0 AS tp1
WHERE tp1.[GM] IN
(SELECT top 50 percent tp2.[GM]
FROM tbl_DatedModel_2015_1005_0 AS tp2
WHERE t1.[GICS Sector] = tp2.[GICS Sector] ORDER BY tp2.[GM])) AS Median,
(SELECT Max(tp1.[GM])
FROM tbl_DatedModel_2015_1005_0 AS tp1
WHERE tp1.[GM] IN
(SELECT top 75 percent tp2.[GM]
FROM tbl_DatedModel_2015_1005_0 AS tp2
WHERE t1.[GICS Sector] = tp2.[GICS Sector] ORDER BY tp2.[GM])) AS Q3,
Max(t1.[GM]) AS High,
((SELECT Max(tp1.[GM])
FROM tbl_DatedModel_2015_1005_0 AS tp1
WHERE tp1.[GM] IN
(SELECT top 75 percent tp2.[GM]
FROM tbl_DatedModel_2015_1005_0 AS tp2
WHERE t1.[GICS Sector] = tp2.[GICS Sector] ORDER BY tp2.[GM])) -
(SELECT Max(tp1.[GM])
FROM tbl_DatedModel_2015_1005_0 AS tp1
WHERE tp1.[GM] IN
(SELECT top 25 percent tp2.[GM]
FROM tbl_DatedModel_2015_1005_0 AS tp2
WHERE t1.[GICS Sector] = tp2.[GICS Sector] ORDER BY tp2.[GM]))) AS IQR,
IIf((Sum((t1.[GM]-z.TheAvg)^2)/Count(t1.[GM])) = 0 OR Count(t1.[GM]) = 0,NULL,Sum((t1.[GM]-z.TheAvg)^4)/Count(t1.[GM]))/(Sum((t1.[GM]-z.TheAvg)^2)/Count(t1.[GM]))^2 AS Kurtosis,
IIf(Count(t1.[GM]) = 0 OR (Sum((t1.[GM]-z.TheAvg)^2)/Count(t1.[GM]))^1.5 = 0,NULL,(Sum((t1.[GM]-z.TheAvg)^3)/Count(t1.[GM]))/(Sum((t1.[GM]-z.TheAvg)^2)/Count(t1.[GM]))^1.5) AS Skewness,
Count(t1.[GM]) AS Obs
FROM tbl_DatedModel_2015_1005_0 AS t1
INNER JOIN
(SELECT t2.[GICS Sector], Avg(t2.[GM]) AS TheAvg
FROM tbl_DatedModel_2015_1005_0 AS t2
GROUP BY t2.[GICS Sector]) AS z ON t1.[GICS Sector] = z.[GICS Sector]
GROUP BY t1.[GICS Sector]
HAVING Count(t1.[GM]) > 0
Notes:
- Above and below the UNION ALL is identical except for the value being calculated. In the code above, the values being calculated are IU and GM.
- This is just sample code. In reality, there are several UNION ALL statements, but omitted them here for simplicity. The query is built dynamically in VBA.
- I cannot remove the calculations, but I am open to modifying/simplifying them.
- I think an easy'ish way to optimize this is the bring in an index for "GICS Sector" for the WHERE statements. Confirmation and sql examples on this would be useful to me.
- This must work in MS Access 2013.
- All SQL. I.e. no VBA please.
- Although a soild SQL answer is much preferred, conceptual ideas are welcome.
- I realize some optimization is limited because you don't know my data structure.
- Nothing in my query is an ID (at this point).
- I have read this: https://support.microsoft.com/en-us/kb/209126