2

I need to return the top 10 products per year, how can I do this with my following query?

SELECT 
   DP.ProductID
   , DP.Name
   , Year(FS.OrderDate) as TheYear
   , FS.OrderQty
   , FS.OrderAmount
FROM dbo.DimProduct AS DP
LEFT JOIN dbo.FactSales as FS on FS.ProductID = DP.ProductID
Adam Wenger
  • 17,100
  • 6
  • 52
  • 63
Etienne
  • 7,141
  • 42
  • 108
  • 160

3 Answers3

3

This should be easy if your RDBMS supports Window Functions

SELECT  ProductID,
        Name,
        TheYear,
        OrderQty,
        OrderAmount
FROM    
        (
            SELECT  DP.ProductID
                    ,DP.Name
                    ,Year(FS.OrderDate) as TheYear
                    ,FS.OrderQty
                    ,FS.OrderAmount,
                    ,ROW_NUMBER() OVER() (PARTITION BY Year(FS.OrderDate) 
                                        ORDER BY FS.OrderQty DESC) rn
            FROM    dbo.DimProduct AS DP
                    LEFT JOIN dbo.FactSales as FS 
                        on FS.ProductID = DP.ProductID
        ) s
WHERE   rn <= 10
ORDER   BY TheYear

The current query will give you 10 products for every TheYear based on FS.OrderQty since you have not mentioned the criteria on how the records will be sorted out.

The ROW_NUMBER() (a RANKING function) will generate a sequence of number for each group, in this case Year(FS.OrderDate), that is sorted out based on FS.OrderQty. The records will then be filtered out based on the value of the generated sequence.


However, if ROW_NUMBER() will not generate TIE on the records having the same FS.OrderQty. If you want it to be handled, use DENSE_RANK() instead of ROW_NUMBER().

John Woo
  • 258,903
  • 69
  • 498
  • 492
  • 3
    I assume that `TOP 10` product of each year means acording to `OrderQty*OrderAmount`. – Tim Schmelter May 31 '13 at 13:16
  • Probably looking for ORDER BY FS.OrderQty or FS.OrderAmount for top 10 products per year. Think sales reporting. – Norman H May 31 '13 at 13:16
  • @TimSchmelter good point. not sure since the OP did not mentioned about any criteria. – John Woo May 31 '13 at 13:17
  • This is the way to go. However you may have duplicate amounts in which case you may want to use Rank() instead of Row_Number(). Row_Number will give you 10 rows no matter what. Rank may give more than 10 rows if you have a tie in your quantity. For instance, let's say in a race 3 people tie for second place and you say you want the top 2. Using Row_Number you would get the #1 finisher and just one of the #2 finishers (1, random 2). Using Rank you would get the #1 finisher and all three #2 finishers ( 1, 2, 2, 2) – Papa Burgundy May 31 '13 at 13:31
  • @PapaBurgundy thank you for your concern, but in that case, I rather use `DENSE_RANK()` than `RANK()`. – John Woo May 31 '13 at 13:34
  • @JW웃 well it depends on what you're trying to achieve. Let's say that in my example you want the top 3 finishers. Rank would give you 1 and all of the 2s, which is what I want. There would be no 3rd or 4th place finisher. The next one would be 5th. If I were to use Dense_Rank it would return the 4th and 5th place people which is not what I would want. So it just depends on what you're after. Each one has its own purpose. – Papa Burgundy May 31 '13 at 13:41
  • @PapaBurgundy I'm not against you or whatsoever. `:)` My answer is based on what the OP wants. Yes, each of each ranking function has its own purpose. – John Woo May 31 '13 at 13:43
  • 1
    @JW웃 No worries. I'm trying to help the OP as well. He doesn't specify his exact definition of top 10. Row_Number(), Rank(), and Dense_Rank() are all options for him. He'll need to make that decision based on his application. – Papa Burgundy May 31 '13 at 13:49
3

You want to use the function row_number() to get the top 10. This assumes that OrderQty defines the top 10:

select t.*
from (SELECT DP.ProductID, DP.Name, Year(FS.OrderDate) as TheYear, FS.OrderQty, FS.OrderAmount,
             row_number() over (partition by Year(FS.OrderDate)
                                order by fs.OrderAmount desc
                               ) as seqnum
      FROM dbo.DimProduct DP LEFT JOIN
           dbo.FactSales FS
           on FS.ProductID = DP.ProductID
    ) t
where seqnum <= 10;

The function row_number() enumerates rows, starting with 1. It starts over within each group, as defined by the partition by clause (in your case, the year). The ordering of the numbers is based on the order by clause (in your case, fs.OrderAmount desc). So, the ten best products in each year will have the numbers 1-10 and the where clause just chooses them.

Gordon Linoff
  • 1,242,037
  • 58
  • 646
  • 786
0
SELECT 
   DP.ProductID
   , DP.Name
   , Year(FS.OrderDate) as TheYear
   , FS.OrderQty
   , FS.OrderAmount 
   , (FS.OrderQty * FS.OrderAmount) AS FS.Total
FROM dbo.DimProduct AS DP
LEFT JOIN dbo.FactSales as FS on FS.ProductID = DP.ProductID
GROUP BY TheYear, DP.ProductID, FS.Total
ORDER BY FS.Total DESC
WHERE seqnum <= 10;
Gimmy
  • 3,781
  • 2
  • 18
  • 27