I have a (heavily simplified) orders table, total
being the dollar amount, containing:
| id | client_id | type | total |
|----|-----------|--------|-------|
| 1 | 1 | sale | 100 |
| 2 | 1 | refund | 100 |
| 3 | 1 | refund | 100 |
And clients table containing:
| id | name |
|----|------|
| 1 | test |
I am attempting to create a breakdown, by client, metrics about the total number of sales, refunds, sum of sales, sum of refunds etc.
To do this, I am querying the clients table and joining the orders table. The orders table contains both sales and refunds, specified by the type
column.
My idea was to join the orders twice using subqueries and create aliases for those filtered tables. The aliases would then be used in aggregate functions to find the sum, average etc. I have tried many variations of joining the orders table twice to achieve this but it produces the same incorrect results. This query demonstrates this idea:
SELECT
clients.*,
SUM(sales.total) as total_sales,
SUM(refunds.total) as total_refunds,
AVG(sales.total) as avg_ticket,
COUNT(sales.*) as num_of_sales
FROM clients
LEFT JOIN (SELECT * FROM orders WHERE type = 'sale') as sales
ON sales.client_id = clients.id
LEFT JOIN (SELECT * FROM orders WHERE type = 'refund') as refunds
ON refunds.client_id = clients.id
GROUP BY clients.id
Result:
| id | name | total_sales | total_refunds | avg_ticket | num_of_sales |
|----|------|-------------|---------------|------------|--------------|
| 1 | test | 200 | 200 | 100 | 2 |
Expected result:
| id | name | total_sales | total_refunds | avg_ticket | num_of_sales |
|----|------|-------------|---------------|------------|--------------|
| 1 | test | 100 | 200 | 100 | 1 |
When the second join is included in the query, the rows returned from the first join are returned again with the second join. They are multiplied by the number of rows in the second join. It's clear my understanding of joining and/or subqueries is incomplete.
I understand that I can filter the orders table with each aggregate function. This produces correct results but seems inefficient:
SELECT
clients.*,
SUM(orders.total) FILTER (WHERE type = 'sale') as total_sales,
SUM(orders.total) FILTER (WHERE type = 'refund') as total_refunds,
AVG(orders.total) FILTER (WHERE type = 'sale') as avg_ticket,
COUNT(orders.*) FILTER (WHERE type = 'sale') as num_of_sales
FROM clients
LEFT JOIN orders
on orders.client_id = clients.id
GROUP BY clients.id
What is the appropriate way to created filtered and aliased versions of this joined table?
Also, what exactly is happening with my initial query where the two subqueries are joined. I would expect them to be treated as separate subsets even though they are operating on the same (orders) table.