0

I have a table that doesn't have a primary key and is partitioned by date; columns like this:

1. user_id  
2. device
3. region
4. datetime
5. and other columns

It contains user generated events from a website game, they trigger every second. I want to return a batch with all the events (including duplicate rows) generated by the first 6 users (top of the table) in the current day that check the conditions:

for region = US

- one user from iOS
- one user from android
- one user from PC

for region = EU

- one user from iOS
- one user from android
- one user from PC

Can you provide a sample code from where I should start? A friend of mine suggested something about RANK() but I never used it.

Thank you !

SELECT * FROM 
    (SELECT user_id, 
    event_post_time, 
    device, 
    region, 
    COUNT(DISTINCT player_id) over (partition by player_id) as ct_pid, 
    COUNT(DISTINCT region) over (partition by region) as ct_region, 
    COUNT(DISTINCT device) over (partition by device) as ct_device 
    FROM events 
    WHERE event_post_time = current_date() 
    AND region IN ('EU','US') 
    AND device IN ('ios','android','pc')) e 
WHERE ct_pid <= 6 
AND ct_region <= 2 
AND ct_device <= 3 
ORDER BY player_id

Adding dummy data at SQLFiddle and expected output:

user_id device region date_generated
  1  ios  EU  22-05-18
  1  ios  EU  22-05-18
  1  ios  EU  22-05-18
  4  ios  US  22-05-18
  4  ios  US  22-05-18
  2  android  EU  22-05-18
  2  android  US  22-05-18
  4  pc  EU  22-05-18
  4  pc  EU  22-05-18
  4  pc  EU  22-05-18
  5  pc  US  22-05-18
Andrew O
  • 13
  • 5
  • Do you need basic sample like, SELECT * FROM ? If you have the basic figured out but just can't figure out the ranking, throw that code in the question. It will be easier to help. – sniperd May 21 '18 at 19:21
  • Yes, the sample should be SELECT * – Andrew O May 21 '18 at 19:29
  • @sniperd SELECT * FROM (SELECT user_id, event_post_time, device, region, COUNT(DISTINCT player_id) over (partition by player_id) as ct_pid, COUNT(DISTINCT region) over (partition by region) as ct_region, COUNT(DISTINCT device) over (partition by device) as ct_device FROM events WHERE event_post_time = current_date() AND region IN ('EU','US') AND device IN ('ios','android','pc')) e WHERE ct_pid <= 6 AND ct_region <= 2 AND ct_device <= 3 ORDER BY player_id – Andrew O May 21 '18 at 20:02

1 Answers1

0

Probably, this is what you are looking for.

select * from (
select rank() over (partition by region,device order by cn desc) as
top_num,player_id, region,device,cn from 
(
select count(*) as cn , player_id,region,device from 
test_table group by player_id,region,device 
)l
)t 
where top_num = 1;

Let me know if this helps.

OP EDIT: I managed to make it work for what I wanted using your provided query; here is the final one

WITH combo 
 AS (SELECT user_id, 
            region, 
            device 
     FROM   (SELECT Rank() 
                      OVER ( 
                        partition BY region, device 
                        ORDER BY cn DESC) AS top_num, 
                    user_id, 
                    region, 
                    device, 
                    cn 
             FROM   (SELECT Count(*) AS cn, 
                            user_id, 
                            region, 
                            device 
                     FROM   samples 
                     GROUP  BY user_id, 
                               region, 
                               device)l)t 
     WHERE  top_num = 1) 
SELECT s.user_id, 
   s.region, 
   s.device 
FROM   samples s 
   JOIN combo 
     ON s.user_id = combo.user_id 
        AND s.region = combo.region 
        AND s.device = combo.device 
Andrew O
  • 13
  • 5
Batman
  • 50
  • 8
  • Thank you for your reply. I tried your code and it close to what I am looking for. Here is a link to [SQL Fiddle](http://sqlfiddle.com/#!9/b53e1b) that has some dummy data. The following output is what I need: user_id device region date_generated 1 ios EU 22-05-18 1 ios EU 22-05-18 1 ios EU 22-05-18 4 ios US 22-05-18 4 ios US 22-05-18 2 android EU 22-05-18 2 android US 22-05-18 4 pc EU 22-05-18 4 pc EU 22-05-18 4 pc EU 22-05-18 5 pc US 22-05-18 – Andrew O May 23 '18 at 16:45
  • The logic is good, but I need also all other rows for a combination. Check the original question as I attached the expected output. Thank you for replying btw, I will try to adapt what you gave me so maybe I can make work. – Andrew O May 24 '18 at 18:32