0

I'm having such trouble visualizing this and need some help. Tasked with finding items that are most frequently purchased together. Like what do customers usually add to their purchase. And how many times has that occurred.

All the data is in 1 table, with the following columns:

  • Order Number
  • Item Code
  • Color Code
  • Size
  • Item Description

Each row in this table not necessarily a unique purchase - for example, it can have order number 1 listed twice because they made a purchase of Item A and Item B, hence 2 rows.

I know this may not be the best explanation so please let me know if you have questions.

John Rotenstein
  • 241,921
  • 22
  • 380
  • 470
Z41N
  • 97
  • 10

1 Answers1

1

You can get the number of orders that two items are in using a self-join and aggregation:

select t1.itemcode, t2.itemcode, count(distinct t1.ordernumber) as num_orders
from t t1 join
     t t2
     on t1.ordernumber = t2.ordernumber and
        t1.itemcode < t2.itemcode
group by t1.itemcode, t2.itemcode
order by num_orders desc;
Gordon Linoff
  • 1,242,037
  • 58
  • 646
  • 786
  • Thanks, Gordon. Now if I have a few conditions that I typically put in the WHERE clause, where exactly does that go? Before JOIN or before the GROUP BY? – Z41N Jan 31 '20 at 15:13
  • @zain . . . The `WHERE` clause always goes after the `from` and before the `group by` (if any). – Gordon Linoff Jan 31 '20 at 17:02