1

I want to filter records based on on the time interval but I have timestamps. I have a table 'df' with the following column names:

id - int time - timestamp correctness - boolean subject - text

Every student (id) completes tasks on particular subject (subject). The system assigns "True" value in column "correctness" if the assignment is completed correctly and "False" if not. The time (time) when the student completes the task is also saved by the system.

I need to write an sql query that counts all students who completed 20 tasks successfully within an hour during March 2020. I need to count those who completed the tasks during the hour from the actual time they started and actual time the finished.

Thanks in advance!

Alex
  • 179
  • 10
  • I need to count those who completed the tasks during the hour from the actual time they started and actual time the finished. 'Date_trunc' does not provide for this – Alex Sep 15 '20 at 17:04

1 Answers1

1

You could use window functions and a range frame:

select distinct id
from (
    select 
        t.*, 
        count(*) filter(where correctness) over(
            partition by id 
            order by time 
            range between interval '1 hour' preceding and current row
        ) cnt
    from mytable t
    where time >= date '2020-03-01' and time < date '2020-04-01'
) t
where cnt > 20

The window function counts how many tasks where successfully performed by the same user within the last hour; you can then use this information to filter the resultset.

This gives you a list of users that satisfy the condition. If you want the count of such user, then replace select distinct id with select count(distinct id).

GMB
  • 216,147
  • 25
  • 84
  • 135
  • Thank you but I need to count those who completed the tasks during the hour from the actual time they started and actual time they finished. – Alex Sep 15 '20 at 17:50
  • 1
    @Alex: that’s what the query does... Did you try it? – GMB Sep 15 '20 at 18:07
  • Yes, it worked but correctness didn't. I modified correctness= 'True' but it still includes observations with both 'True' and 'False' values. – Alex Sep 15 '20 at 18:15
  • @Alex: you did not specify what to do with negatives. You said you wanted users and that have at least 20 positives in one hour - which the query does. – GMB Sep 15 '20 at 23:26