0

I have the following transaction table:

customer_id, category, product_id, score

I'm grouping by customer_id and category to create a list of product_id-score map pairs:

SELECT
    s.customer_id,    
    s.category,
    collect_list(s.pair) 
FROM
    (
        SELECT
            customer_id,
            category,
            map(product_id, score) AS pair
        FROM
            transaction
        WHERE
            score > {score_threshold}
    ) s 
GROUP BY
    s.customer_id,
    s.category

Now I want to take this a step further. For each group, I'm looking to retain just the top n pairs, sorted by score (descending order).I tried the OVER (PARTITION BY...ORDER BY) by I'm running into problems.

Note: the transaction table is partitioned by category

Thanks

alhazen
  • 1,907
  • 3
  • 22
  • 43

1 Answers1

1

Try this:

SELECT
        s.customer_id,    
        s.category,
        collect_list(s.pair) 
    FROM
        (
            SELECT
                ROW_NUMBER() OVER (PARTITION BY customer_id, category ORDER BY score desc) as RowId 
                customer_id,
                category,
                map(product_id, score) AS pair
            FROM
                transaction
            WHERE
                score > {score_threshold}
        ) s 
        where s.RowId < n
    
    GROUP BY
        s.customer_id,
        s.category
mrivanlima
  • 561
  • 4
  • 10