0

I have following query:

WITH relationships AS (
    SELECT related_user_id, count(*) AS trade_count
    FROM trade_history
    WHERE user_id = 487834568 
    GROUP BY related_user_id
    ORDER BY trade_count DESC
)
SELECT offers.*,
       relationships.trade_count
FROM offers
         LEFT JOIN user_stock
                   ON user_stock.user_id = 487834568 and offers.product_id = user_stock.product_id
         LEFT JOIN relationships
                   ON offers.user_id = relationships.related_user_id
WHERE offers.state = 'OPEN'
  AND offers.user_id != 487834568        
  AND offers.group BETWEEN 1 AND 3 
ORDER BY offers.created_at,
         relationships.trade_count DESC,
         user_stock.amount NULLS FIRST;

The query shows me all offers and orders them by:

  • Show older offers first
  • Show offers with a higher trade count first (two user traded which each other)
  • User stock

What I need in addition is to limit the results by max 3 per product_id. I did some googling and figured out that this should be possible by a window function using row_number() and lateral joins. I don't want to use row_number() as the tables contain a lot of entries and I would run into performance issues most likely. I guess lateral joins are the right tool (read here http://charlesnagy.info/it/postgresql/group-by-limit-per-group-in-postgresql) but I could not get a decent result with my limited SQL knowledge.

How do I get the grouping in a performant way?

Edit: To make it more clear, I implemented the solution with a window function:

WITH relationships AS (
    SELECT related_user_id, count(*) AS trade_count
    FROM trade_history
    WHERE user_id = 487834568 
    GROUP BY related_user_id
    ORDER BY trade_count DESC
)
SELECT * FROM (
    SELECT 
           offers.*,
           relationships.trade_count,
           row_number() OVER (
               PARTITION BY resource_id
               ORDER BY 
                   offers.created_at,
                   relationships.trade_count DESC,
                   user_stock.amount NULLS FIRST
        ) AS row_number
    FROM offers
             LEFT JOIN user_stock
                       ON user_stock.user_id = 487834568 and offers.product_id = user_stock.product_id
             LEFT JOIN relationships
                       ON offers.user_id = relationships.related_user_id
    WHERE offers.state = 'OPEN'
      AND offers.user_id != 487834568        
      AND offers.group BETWEEN 1 AND 3 
                  ORDER BY row_number
              ) AS ordered_offers
WHERE ordered_offers.row_number <= 3;
messy
  • 915
  • 6
  • 26

1 Answers1

1

If you can add some grouping there, add HAVING count(offers.product_id) <= 3 between WHERE and ORDER BY

Putting it all together, this feels right. I'm going to try a similar grouping based on a different local DB and see if it works as I expect, will update if it doesn't.

WITH relationships AS (
  SELECT related_user_id, count(*) AS trade_count
  FROM trade_history
  WHERE user_id = 487834568 
  GROUP BY related_user_id
  ORDER BY trade_count DESC
)
SELECT
  offers.*,
  relationships.trade_count
FROM offers
LEFT JOIN user_stock ON
  user_stock.user_id = 487834568 AND
  offers.product_id = user_stock.product_id
LEFT JOIN relationships ON
  offers.user_id = relationships.related_user_id
LEFT JOIN (
  SELECT array_agg(offers.id) AS ids
  FROM offers
  JOIN user_stock ON
    user_stock.user_id = 487834568 AND
    offers.user_id = user_stock.user_id AND
    offers.product_id = user_stock.product_id
  GROUP BY offers.user_id
  HAVING count(offers.id) <= 3
) AS offers_limit ON
  offers.id = any(offers_limit.ids)
WHERE
  offers.state = 'OPEN' AND
  offers.user_id != 487834568 AND
  offers.group BETWEEN 1 AND 3
ORDER BY
  offers.created_at,
  relationships.trade_count DESC,
  user_stock.amount NULLS FIRST
mike.k
  • 3,277
  • 1
  • 12
  • 18
  • That's also something I already tried, but I would have to adjust the `GROUP BY` clause and would not have the columns selected which I want to have. At least in my understanding. `[42803] ERROR: column "offers.id" must appear in the GROUP BY clause or be used in an aggregate function` – messy Sep 25 '20 at 14:29
  • @messy perhaps join `offers` to itself on `id`, and use the `HAVING` clause only on the joined table, with limited columns selected. `JOIN offers AS offers_limit ON offers_limit.id = offers.id AND offers_limit.product_id = user_stock.product_id` ... `GROUP BY offers_limit.product_id` ... `HAVING count(offers_limit.id) <= 3` – mike.k Sep 25 '20 at 14:38
  • Is there a reason you replaced the `LEFT JOIN` with `JOIN`? Sadly your query doesn't give the result I need. It just returns nothing. I'll simplify my query and give it another try in the next days. – messy Sep 26 '20 at 07:14
  • A `LEFT JOIN` would give you results where the right tables don't meet the criteria, it doesn't seem like what you want. Can you confirm that there is data which should be returned, perhaps you can loosen the 3 value to something larger and check it. – mike.k Sep 26 '20 at 16:27
  • Yes, I need the data were the right tables don't meet the criteria. I was not able to get it done other than using `row_numbers()` (see edited post). – messy Sep 28 '20 at 08:22
  • @messy try it with LEFT JOIN then, I'll edit the answer. – mike.k Sep 28 '20 at 14:44
  • I did. But the result was empty. At the moment I went with the `row_number` solution which I posted in my original post. – messy Sep 30 '20 at 08:08