I have a dataset of purchases with customer, product and category.
customer product category sales_value
A aerosol air_care 10
B aerosol air_care 12
C aerosol air_care 7
A perfume air_care 8
A perfume air_care 2
D perfume air_care 11
C burger food 13
D fries food 6
C fries food 9
I want, for each product, the ratio between the sales value spent on this product, and the sales value spent on this product's category, by the customers who bought the product at least once.
Another way to say it: take customers who bought fries
at least once, and for all of them, compute A) the sum of sales value spent on fries
and B) the sum of sales value spent on food
.
An intermediate table would be of the following form:
product category sum_spent_on_product sum_spent_on_category ratio
by_people_buying_product
aerosol air_care 29 39 0.74
perfume air_care 21 31 0.68
burger food 13 22 0.59
fries food 15 28 0.53
Example: people having bought aerosol
at least once, spent a total of 1800 on this product. The same people, overall, spent 3600 on the air_care
category (which aerosol
belongs to). Thus, the ratio for aerosol
is 0.5.
I've tried to solve this using left join lateral
and compute the given intermediate results for each product
, but I can't get to wrap my head around how to include the condition only for customers who bought this specific product
:
select
distinct (product_id)
, category
, c.sales_category
from transactions t
left join lateral (
select
sum(sales_value) as sales_category
from transactions
where category = t.category
group by category
) c on true
;
The above query lists the sum of spent on the product's category for each product, but without the required product-buyer condition.
Is left join lateral
the right way to go? Is there any other solution in plain SQL?