0

I'm creating a price comparison service. Products from one Site are compared to products from one or more Sites. Products are matched from one Site to another using a ProductMatch table:

UML Diagram

Given the following query to extract products, together with their matches:

SELECT
    p1.id AS p1_id, p1.name AS p1_name, p1.price AS p1_price,
    p2.id AS p2_id, p2.name AS p2_name, p2.price AS p2_price,
    m.time
FROM Product p1
LEFT JOIN ProductMatch m ON m.fromProduct_id = p1.id
LEFT JOIN Product p2 ON m.toProduct_id = p2.id

WHERE p1.site_id = 1;

How can I filter products whose price (p1.price) is lower than the minimum competitor price (MIN(p2.price))?

Using subqueries, here's how I would do it:

SELECT
    p1.id AS p1_id, p1.name AS p1_name, p1.price AS p1_price,
    p2.id AS p2_id, p2.name AS p2_name, p2.price AS p2_price,
    m.time
FROM Product p1
LEFT JOIN ProductMatch m ON m.fromProduct_id = p1.id
LEFT JOIN Product p2 ON m.toProduct_id = p2.id

WHERE p1.id IN (
    SELECT x.id FROM (
        SELECT _p1.id, _p1.price
        FROM Product _p1
        JOIN ProductMatch _m ON _m.fromProduct_id = _p1.id
        JOIN Product _p2 ON _m.toProduct_id = _p2.id
        WHERE _p1.site_id = 1
        GROUP BY _p1.id
        HAVING _p1.price < MIN(_p2.price)
    ) x
);

Is it possible to simplify this query to not use subqueries?

My concerns:

  • it feels weird to repeat the exact same joins in the subquery
  • I have concerns about the performance of subqueries on larger data sets
  • subqueries don't play very well with my ORM
BenMorel
  • 34,448
  • 50
  • 182
  • 322
  • Please provide sample data and desired results. Can there be chains of matching products: (1, 2), (2, 3), (3, 4) without having all combos? – Gordon Linoff Feb 19 '20 at 15:40
  • MySQL 8 supports *Windowed Aggregates*, check your version – dnoeth Feb 19 '20 at 15:41
  • @GordonLinoff There cannot be chains of matching products **for a given site**. One Product from one Site will match at most one Product from another Site, and we'll always query base products for a single Site. I've edited my question with a (too complicated) query that returns the expected result, hopefully this will clarify the question. – BenMorel Feb 19 '20 at 15:53
  • @dnoeth I'm using MySQL 8.0. Can you explain how to rewrite this query using windowed aggregates? – BenMorel Feb 19 '20 at 15:53

1 Answers1

1

With MIN() window function inside a CTE which will be filtered:

WITH cte AS (
  SELECT
    p1.id AS p1_id, p1.name AS p1_name, p1.price AS p1_price,
    p2.id AS p2_id, p2.name AS p2_name, p2.price AS p2_price,
    m.time,
    MIN(p2.price) OVER (PARTITION BY p1.id) AS min_price
  FROM Product p1
  LEFT JOIN ProductMatch m ON m.fromProduct_id = p1.id
  LEFT JOIN Product p2 ON m.toProduct_id = p2.id
  WHERE p1.site_id = 1
)
SELECT 
  p1_id, p1_name, p1_price,
  p2_id, p2_name, p2_price,
  time
FROM cte
WHERE p1_price < min_price
forpas
  • 160,666
  • 10
  • 38
  • 76
  • But then I'm only getting the products that match the filters (`p1`), and not the compared products (`p2`) in the same query. Your query is basically the contents of my subquery! – BenMorel Feb 19 '20 at 16:13
  • Thank you for an example of using a window function. I played a bit with them, and it looks like window functions are not allowed in a `WHERE` or `HAVING` clause, this is I guess what forced you to using a CTE on top of it. This checks 2 out of 3 boxes: it doesn't repeat the joins, and I guess this makes it better from a performance point of view. The only worry I have is that it's as hard as the subquery to translate to an ORM, but I guess this is as good as it can be! – BenMorel Feb 19 '20 at 16:52
  • In this case the CTE is not really needed. The code inside it can be used and and be filtered but with a CTE the code is more readable and if needed reusable. – forpas Feb 19 '20 at 17:00
  • I think you actually do need a CTE. I tried to `WHERE|HAVING` `p1_price < min_price` in the inner query, and it fails, for [this reason](https://stackoverflow.com/q/13997177/759866). Did I miss something? – BenMorel Feb 19 '20 at 17:13
  • What I meant is that if you use in the main body of the query instead of `...FROM cte...`: `...FROM () AS c ...` then you can do: `WHERE c.p1_price < c.min_price` – forpas Feb 19 '20 at 17:17
  • Oh sure, but then you have a subquery! – BenMorel Feb 19 '20 at 17:32
  • Yes but but you don't repeat the same query in the subquery. You just set a filter. – forpas Feb 19 '20 at 17:34