0

Well, the query I need is simple, and maybe is in another question, but there is a performance thing in what I need, so:

I have a table of users with 10.000 rows, the table contains id, email and more data.

In another table called orders I have way more rows, maybe 150.000 rows.

In this orders I have the id of the user that made the order, and also a status of the order. The status could be a number from 0 to 9 (or null).

My final requirement is to have every user with the id, email, some other column , and the number of orders with status 3 or 7. it does not care of its 3 or 7, I just need the amount

But I need to do this query in a low-impact way (or a performant way).

What is the best approach?

I need to run this in a redash with postgres 10.

marc_s
  • 732,580
  • 175
  • 1,330
  • 1,459
burdk
  • 79
  • 12

1 Answers1

2

This sounds like a join and group by:

select u.*, count(*)
from users u join
     orders o
     on o.user_id = u.user_id
where o.status in (3, 7)
group by u.user_id;

Postgres is usually pretty good about optimizing these queries -- and the above assumes that users(user_id) is the primary key -- so this should work pretty well.

Gordon Linoff
  • 1,242,037
  • 58
  • 646
  • 786
  • Could it be even faster if `o.status in (3, 7)` is moved inside the `join` clause, either as is, or expanded to `(o.status == 3 or o.status == 7)`? Is there a way to predict which is faster without `explain analyze`? – Timur Shtatland Aug 14 '20 at 15:28
  • 2
    It will be the same, performance-wise. Postgres is smart enough to treat all these as equivalent – Ruben Helsloot Aug 14 '20 at 16:01
  • and what about in cases where I want to do it as excluding, I mean when i want those who are not status 3? WHERE o.status IN (<>) – burdk Aug 14 '20 at 19:46