9

I have an events-based table in Redshift. I want to tie all events to the FIRST event in the series, provided that event was in the N-hours preceding this event.

If all I cared about was the very first row, I'd simply do:

SELECT
   event_time
   ,first_value(event_time) 
      OVER (ORDER BY event_time rows unbounded preceding) as first_time
FROM
   my_table

But because I only want to tie this to the first event in the past N-hours, I want something like:

SELECT
   event_time
   ,first_value(event_time) 
       OVER (ORDER BY event_time rows between [N-hours ago] and current row) as first_time
FROM
   my_table

A little background on my table. It's user actions, so effectively a user jumps on, performs 1-100 actions, and then leaves. Most users are 1-10x per day. Sessions rarely last over an hour, so I could set N=1.

If I just set a PARTITION BY date_trunc('hour', event_time), I'll double create for sessions that span the hour.

Assume my_table looks like

id | user_id | event_time
----------------------------------
 1 |   123   | 2015-01-01 01:00:00
 2 |   123   | 2015-01-01 01:15:00
 3 |   123   | 2015-01-01 02:05:00
 4 |   123   | 2015-01-01 13:10:00
 5 |   123   | 2015-01-01 13:20:00
 6 |   123   | 2015-01-01 13:30:00

My goal is to get a result that looks like

id | parent_id | user_id | event_time
----------------------------------
 1 |   1       |  123    | 2015-01-01 01:00:00
 2 |   1       |  123    | 2015-01-01 01:15:00
 3 |   1       |  123    | 2015-01-01 02:05:00
 4 |   4       |  123    | 2015-01-01 13:10:00
 5 |   4       |  123    | 2015-01-01 13:20:00
 6 |   4       |  123    | 2015-01-01 13:30:00
ScottieB
  • 3,958
  • 6
  • 42
  • 60
  • So your question is *not* how to put a condition on a window function but how to calculate sessions with postges, right? That's answered. Can't find it right now. – usr Sep 23 '15 at 23:35
  • Can you just post the sample data of table and their relations? Also how do you need the data? – Viki888 Sep 30 '15 at 14:53

1 Answers1

6

The answer appears to be "no" as of now.

There is a functionality in SQL Server of using RANGE instead of ROWS in the frame. This allows the query to compare values to the current row's value.

https://www.simple-talk.com/sql/learn-sql-server/window-functions-in-sql-server-part-2-the-frame/

When I attempt this syntax in Redshift I get the error that "Range is not yet supported"

Someone update this when that "yet" changes!

ScottieB
  • 3,958
  • 6
  • 42
  • 60