1

I would like to optimize this query:

SELECT "IU" AS Factor, 
        "GICS Sector/" & t1.[GICS Sector], 
        #10/6/2015# AS CalcDate, 
        (Count(t1.[IU]) / Count(*) - 1)*100 AS NaPct, 
        Avg(t1.[IU]) AS Mean, 
        StDev(t1.[IU]) AS Sd, 
        Min(t1.[IU]) AS Low, 
         (SELECT Max(tp1.[IU]) 
         FROM tbl_DatedModel_2015_1005_0 AS tp1 
         WHERE tp1.[IU] IN 
             (SELECT top 25 percent tp2.[IU] 
             FROM tbl_DatedModel_2015_1005_0 AS tp2 
             WHERE t1.[GICS Sector] = tp2.[GICS Sector] ORDER BY tp2.[IU])) AS 25Percentile, 
         (SELECT Max(tp1.[IU]) 
         FROM tbl_DatedModel_2015_1005_0 AS tp1 
         WHERE tp1.[IU] IN 
             (SELECT top 50 percent tp2.[IU] 
             FROM tbl_DatedModel_2015_1005_0 AS tp2 
             WHERE t1.[GICS Sector] = tp2.[GICS Sector] ORDER BY tp2.[IU])) AS Median, 
         (SELECT Max(tp1.[IU]) 
         FROM tbl_DatedModel_2015_1005_0 AS tp1 
         WHERE tp1.[IU] IN 
             (SELECT top 75 percent tp2.[IU] 
             FROM tbl_DatedModel_2015_1005_0 AS tp2 
             WHERE t1.[GICS Sector] = tp2.[GICS Sector] ORDER BY tp2.[IU])) AS Q3, 
        Max(t1.[IU]) AS High, 
         ((SELECT Max(tp1.[IU]) 
         FROM tbl_DatedModel_2015_1005_0 AS tp1 
         WHERE tp1.[IU] IN 
             (SELECT top 75 percent tp2.[IU] 
             FROM tbl_DatedModel_2015_1005_0 AS tp2 
             WHERE t1.[GICS Sector] = tp2.[GICS Sector] ORDER BY tp2.[IU])) - 
         (SELECT Max(tp1.[IU]) 
         FROM tbl_DatedModel_2015_1005_0 AS tp1 
         WHERE tp1.[IU] IN 
             (SELECT top 25 percent tp2.[IU] 
             FROM tbl_DatedModel_2015_1005_0 AS tp2 
             WHERE t1.[GICS Sector] = tp2.[GICS Sector] ORDER BY tp2.[IU]))) AS IQR, 
        IIf((Sum((t1.[IU]-z.TheAvg)^2)/Count(t1.[IU])) = 0 OR Count(t1.[IU]) = 0,NULL,Sum((t1.[IU]-z.TheAvg)^4)/Count(t1.[IU]))/(Sum((t1.[IU]-z.TheAvg)^2)/Count(t1.[IU]))^2 AS Kurtosis, 
        IIf(Count(t1.[IU]) = 0 OR (Sum((t1.[IU]-z.TheAvg)^2)/Count(t1.[IU]))^1.5 = 0,NULL,(Sum((t1.[IU]-z.TheAvg)^3)/Count(t1.[IU]))/(Sum((t1.[IU]-z.TheAvg)^2)/Count(t1.[IU]))^1.5) AS Skewness, 
        Count(t1.[IU]) AS Obs 
FROM tbl_DatedModel_2015_1005_0 AS t1 
        INNER JOIN 
                  (SELECT t2.[GICS Sector], Avg(t2.[IU]) AS TheAvg 
                  FROM tbl_DatedModel_2015_1005_0 AS t2 
                  GROUP BY t2.[GICS Sector]) AS z ON t1.[GICS Sector] = z.[GICS Sector] 
GROUP BY t1.[GICS Sector] 
HAVING Count(t1.[IU]) > 0 
UNION ALL 
SELECT "GM" AS Factor, 
        "GICS Sector/" & t1.[GICS Sector], 
        #10/6/2015# AS CalcDate, 
        (Count(t1.[GM]) / Count(*) - 1)*100 AS NaPct, 
        Avg(t1.[GM]) AS Mean, 
        StDev(t1.[GM]) AS Sd, 
        Min(t1.[GM]) AS Low, 
         (SELECT Max(tp1.[GM]) 
         FROM tbl_DatedModel_2015_1005_0 AS tp1 
         WHERE tp1.[GM] IN 
             (SELECT top 25 percent tp2.[GM] 
             FROM tbl_DatedModel_2015_1005_0 AS tp2 
             WHERE t1.[GICS Sector] = tp2.[GICS Sector] ORDER BY tp2.[GM])) AS 25Percentile, 
         (SELECT Max(tp1.[GM]) 
         FROM tbl_DatedModel_2015_1005_0 AS tp1 
         WHERE tp1.[GM] IN 
             (SELECT top 50 percent tp2.[GM] 
             FROM tbl_DatedModel_2015_1005_0 AS tp2 
             WHERE t1.[GICS Sector] = tp2.[GICS Sector] ORDER BY tp2.[GM])) AS Median, 
         (SELECT Max(tp1.[GM]) 
         FROM tbl_DatedModel_2015_1005_0 AS tp1 
         WHERE tp1.[GM] IN 
             (SELECT top 75 percent tp2.[GM] 
             FROM tbl_DatedModel_2015_1005_0 AS tp2 
             WHERE t1.[GICS Sector] = tp2.[GICS Sector] ORDER BY tp2.[GM])) AS Q3, 
        Max(t1.[GM]) AS High, 
         ((SELECT Max(tp1.[GM]) 
         FROM tbl_DatedModel_2015_1005_0 AS tp1 
         WHERE tp1.[GM] IN 
             (SELECT top 75 percent tp2.[GM] 
             FROM tbl_DatedModel_2015_1005_0 AS tp2 
             WHERE t1.[GICS Sector] = tp2.[GICS Sector] ORDER BY tp2.[GM])) - 
         (SELECT Max(tp1.[GM]) 
         FROM tbl_DatedModel_2015_1005_0 AS tp1 
         WHERE tp1.[GM] IN 
             (SELECT top 25 percent tp2.[GM] 
             FROM tbl_DatedModel_2015_1005_0 AS tp2 
             WHERE t1.[GICS Sector] = tp2.[GICS Sector] ORDER BY tp2.[GM]))) AS IQR, 
        IIf((Sum((t1.[GM]-z.TheAvg)^2)/Count(t1.[GM])) = 0 OR Count(t1.[GM]) = 0,NULL,Sum((t1.[GM]-z.TheAvg)^4)/Count(t1.[GM]))/(Sum((t1.[GM]-z.TheAvg)^2)/Count(t1.[GM]))^2 AS Kurtosis, 
        IIf(Count(t1.[GM]) = 0 OR (Sum((t1.[GM]-z.TheAvg)^2)/Count(t1.[GM]))^1.5 = 0,NULL,(Sum((t1.[GM]-z.TheAvg)^3)/Count(t1.[GM]))/(Sum((t1.[GM]-z.TheAvg)^2)/Count(t1.[GM]))^1.5) AS Skewness, 
        Count(t1.[GM]) AS Obs 
FROM tbl_DatedModel_2015_1005_0 AS t1 
        INNER JOIN 
                  (SELECT t2.[GICS Sector], Avg(t2.[GM]) AS TheAvg 
                  FROM tbl_DatedModel_2015_1005_0 AS t2 
                  GROUP BY t2.[GICS Sector]) AS z ON t1.[GICS Sector] = z.[GICS Sector] 
GROUP BY t1.[GICS Sector] 
HAVING Count(t1.[GM]) > 0

Notes:

  • Above and below the UNION ALL is identical except for the value being calculated. In the code above, the values being calculated are IU and GM.
  • This is just sample code. In reality, there are several UNION ALL statements, but omitted them here for simplicity. The query is built dynamically in VBA.
  • I cannot remove the calculations, but I am open to modifying/simplifying them.
  • I think an easy'ish way to optimize this is the bring in an index for "GICS Sector" for the WHERE statements. Confirmation and sql examples on this would be useful to me.
  • This must work in MS Access 2013.
  • All SQL. I.e. no VBA please.
  • Although a soild SQL answer is much preferred, conceptual ideas are welcome.
  • I realize some optimization is limited because you don't know my data structure.
  • Nothing in my query is an ID (at this point).
  • I have read this: https://support.microsoft.com/en-us/kb/209126
mountainclimber11
  • 1,339
  • 1
  • 28
  • 51
  • 1
    I've done one or two things like this in the past. It's really the Union that slows things down. If you have a lot of Unions, my experience has been that's it's quicker to run each query independently, append the results to a table, then query the table. – Gene Oct 08 '15 at 15:56
  • @Gene - Thank you! Your suggestion is faster than what I put in the OP (so a good suggestion), but still super slow. Its actually faster to just calc each record individually via Excel object member functions and add them one at a time to my final table. – mountainclimber11 Nov 06 '15 at 05:55
  • Possible duplicate of [Optimize access query with nested IIF's](https://stackoverflow.com/questions/6618135/optimize-access-query-with-nested-iifs) – Paul Sweatte Jun 05 '17 at 22:35

0 Answers0