1

Input: Have got Table 'A'

Store   Category   Sales   Day
11      aaa        1.5     Sunday
11      aaa        0.5     Monday
11      aaa        2.5     Tuesday
11      aaa        2.0     Wednesday
11      aaa        3.0     Thursday
11      aaa        3.5     Friday
11      aaa        0.5     Saturday
22      bbb        0.5     Sunday
22      bbb        1.5     Monday
22      bbb        2.3     Tuesday
22      bbb        0.3     Wednesday
22      bbb        1.4     Thursday
22      bbb        4.1     Friday
22      bbb        0.2     Saturday

Scenario: Have to take average of Sales grouped by Store,Category and save in separate column as well as pick top 3 sales day and save in separate column. As a result, one row for one store,category as shown below in sample output.

Expected Output:

Store   Category   AvgSales   PeakDay1   PeakDay2   PeakDay3   
11      aaa        1.92       Friday     Thursday   Tuesday
22      bbb        1.47       Friday     Tuesday    Monday

Tried Query:

SELECT
  Store,
  Category,
  avg(Sales) as AvgSales,
  ARRAY_AGG(Sales ORDER BY Sales DESC LIMIT 3)  #but this line will not produce results in 3 separate columns
FROM A
GROUP BY Site, Category

Thanks in Advance!

user12345
  • 499
  • 1
  • 5
  • 21

2 Answers2

2

You could use array agg, but row_number() seem simple enough:

select store, category, avg(sales),
       max(case when seqnum = 1 then day end) as peakday1,
       max(case when seqnum = 2 then day end) as peakday2,
       max(case when seqnum = 3 then day end) as peakday3
from (select store, category, day, 
             row_number() over (partition by store, category order by sales desc) as seqnum
      from a
     ) a
group by store, category;

If you want to put this into an array column instead, you can use:

SELECT Store, Category, avg(Sales) as AvgSales,
       ARRAY_AGG(day ORDER BY Sales DESC LIMIT 3) 
FROM A
GROUP BY Site, Category
Gordon Linoff
  • 1,242,037
  • 58
  • 646
  • 786
1

Below is for BigQuery Standard SQL

#standardSQL
SELECT Store, Category, AvgSales,
  Days[OFFSET(0)] PeakDay1,
  Days[SAFE_OFFSET(1)] PeakDay2,
  Days[SAFE_OFFSET(2)] PeakDay3
FROM (
  SELECT Store, Category, 
    ROUND(AVG(Sales), 2) AvgSales,
    ARRAY_AGG(Day ORDER BY Sales DESC LIMIT 3) Days
  FROM `project.dataset.table` t
  GROUP BY Store, Category
)   

If to apply to sample data from your question - output is

Row Store   Category    AvgSales    PeakDay1    PeakDay2    PeakDay3     
1   11      aaa         1.93        Friday      Thursday    Tuesday  
2   22      bbb         1.47        Friday      Tuesday     Monday   
Mikhail Berlyant
  • 165,386
  • 8
  • 154
  • 230