1

I have the following query (built to showcase the problem)

WITH
    CategoryPromotions
    AS
    (
        SELECT CategoryId = 7, Price = 10
             UNION ALL
        SELECT CategoryId = 3, Price = 15
             UNION ALL
        SELECT CategoryId = 1, Price = 5
    )
,
       Products
       AS
       (
             SELECT Id = 1, Price = 20
       )
,
       ProductsCategories
       AS
       (
             SELECT ProductId = 1, CategoryId = 2
             UNION ALL
             SELECT ProductId = 1, CategoryId = 8
             UNION ALL
             SELECT ProductId = 1, CategoryId = 6
       )
,
       Tally
       AS
       (
             SELECT N = 1
             UNION ALL
             SELECT N = 2
             UNION ALL
             SELECT N = 3
             UNION ALL
             SELECT N = 4
             UNION ALL
             SELECT N = 5
       )
,
       Hierarchy
       AS
       (
             SELECT Id = 2, SortPath = 0x00000001000000070000000400000002
             UNION ALL
             SELECT Id = 8, SortPath = 0x00000001000000070000000400000008
             UNION ALL
             SELECT Id = 6, SortPath = 0x0000000300000006
       )
SELECT ProductsCategories.*, xD.*
FROM Products
       RIGHT JOIN ProductsCategories
             ON Products.Id = ProductsCategories.ProductId
       CROSS APPLY
       (
             SELECT TOP (1) promos.CategoryId
                    , Products.Price AS BasePrice
                    , promos.Price
                    , (
                        CASE
                            WHEN promos.Price IS NOT NULL THEN
                                (Products.Price - promos.Price)
                            ELSE
                                Products.Price
                        END
                    ) AS DiscountedPrice
                    , ROW_NUMBER() OVER
                    (
                           ORDER BY CASE
                                  WHEN promos.Price IS NOT NULL THEN
                                        (Products.Price - promos.Price)
                                  ELSE
                                        Products.Price
                                  END
                           ASC
                    ) AS PriceRank
             FROM (SELECT ProductsCategories.ProductId, ProductsCategories.CategoryId) bpc
                    CROSS APPLY
                    (
                           SELECT TOP (1) categories.CategoryId
                                  , catpromo.Price
                           FROM
                                  (
                                        SELECT CategoryId = CAST(SUBSTRING(Hierarchy.SortPath,Tally.N,4) AS INT)
                                               , Tally.N
                                        FROM Hierarchy
                                               INNER JOIN Tally
                                                      ON Tally.N BETWEEN 1
                                                      AND DATALENGTH(Hierarchy.SortPath)
                                        WHERE Hierarchy.Id = bpc.CategoryId
                                        GROUP BY SUBSTRING(Hierarchy.SortPath,tally.N,4)
                                               , tally.n
                                  ) AS categories
                                  INNER JOIN CategoryPromotions catpromo
                                        ON categories.CategoryId = catpromo.CategoryId
                           ORDER BY categories.N DESC
                    ) AS promos
             WHERE bpc.ProductId = 1
             ORDER BY PriceRank
       ) AS XD
WHERE products.Id = 1;

This is the query result:

enter image description here

Why the ROW_NUMBER isn't working? And is there anything I can do in order to improve the query performance? This will be applied to a million row result query for each individual product. I tried to fake +/- the structure that it will be used in.

The desired result is the 1 row that has the lowest DiscountedPrice. (Cannot use MIN, since I need all the columns)

EDIT: Without TOP (1)

enter image description here

LoadIt
  • 137
  • 1
  • 1
  • 9
  • Because you `select TOP(1)` what result are you expecting? – Juan Carlos Oropeza Sep 12 '19 at 20:24
  • If I remove TOP(1) it does not change the end result (see Edit) And I'm expecting what I said in the post (and need help to achieve it) The end result would be in this case the last row since the DiscountedPrice is the lowest – LoadIt Sep 12 '19 at 20:37

1 Answers1

0

The ROW_NUMBER is working just fine. The problem is that you're calculating it for each row in your ProductCategories table. Here's a lighter version of your query.

WITH cteProductsCategoriesDiscounts AS(
    SELECT 
          ProductsCategories.ProductId
        , ProductsCategories.CategoryId
        , promos.CategoryId AS promoCategoryId
        , Products.Price AS BasePrice
        , promos.Price
        , ISNULL(Products.Price - promos.Price, Products.Price) AS DiscountedPrice
        , ROW_NUMBER() OVER ( PARTITION BY ProductsCategories.ProductId ORDER BY ISNULL(Products.Price - promos.Price, Products.Price) ) AS PriceRank
    FROM Products
    RIGHT JOIN ProductsCategories ON Products.Id = ProductsCategories.ProductId
    CROSS APPLY (
                SELECT  TOP (1)
                          CAST(SUBSTRING(Hierarchy.SortPath,N*4-3,4) AS INT) AS CategoryId
                        , catpromo.Price
                FROM Hierarchy
                INNER JOIN Tally t ON t.N BETWEEN t.n AND DATALENGTH(Hierarchy.SortPath)/4
                INNER JOIN CategoryPromotions catpromo ON CAST(SUBSTRING(Hierarchy.SortPath,N*4-3,4) AS INT) = catpromo.CategoryId
                WHERE Hierarchy.Id = ProductsCategories.CategoryId
                ORDER BY t.N
        ) AS promos
    WHERE ProductsCategories.ProductId = 1
)
SELECT *
FROM cteProductsCategoriesDiscounts
WHERE PriceRank = 1;

EDIT: Made an adjustment to allow multiple products.

Luis Cazares
  • 3,495
  • 8
  • 22
  • Hello, thanks for the very fast response. My biggest concern is this line: RIGHT JOIN ProductsCategories ON Products.Id = ProductsCategories.ProductId The actual query is quite big and has quite some stuff going on. It went (with the version I had working) from 10s to 1m:05s after introducing this calculations. I'll try it tomorrow with your version when I have access to the actual database with the data. And that's why I was trying to use the already joined table instead of the right join but will try again tomorrow with this version. Thank you in advance. – LoadIt Sep 12 '19 at 21:12
  • Is the RIGHT JOIN necessary? Shouldn't all products be in Products table? – Luis Cazares Sep 12 '19 at 21:14
  • The actual database have a ProductTable -> ProductCategories -> Category (with self reference parentCategory) -> CategoryPromotions. I have to list all the products from a category and then go up the ladder until I find a promotion see related issue: https://stackoverflow.com/questions/57896495/recursive-cte-while-parent-id-not-in-a-list/57897273#57897273 and then other tables for taxes, internacionalization. It has quite some stuff going on. The biggest test is with one of the cat. that returns 999992 records and I have to calculate everyhting (takes 10s) plus this issue that raised it to 1m+ – LoadIt Sep 12 '19 at 21:19
  • And btw, you forgot (I suppose) the PARTITION BY ProductsCategories.ProductId in the ROW_NUMBER for multiple products – LoadIt Sep 12 '19 at 21:25
  • Yes, I forgot to include it because it wasn't there in your query and I didn't add test date with an additional product. – Luis Cazares Sep 13 '19 at 16:15
  • Srry for the long time. Thank you very mutch. It got me in the track (and ty for the simplification) managed to get it working for the 1 million records with almost no overhead – LoadIt Sep 14 '19 at 12:16