0

I am working with a table consisting of a number of web sessions with various events and event id:s. To simplify my question, let's say that I have 4 columns which are session_id, event_name and event_id, where the event id can be used to order the events in ascending/descending order. Let's also pretend that we have a large number of events and that I am particularly interest in 3 of the events with event_name: open, submit and decline. Assume that these 3 events can occur in any order.

What I would like to do, is that I would like to add a new column that for each session says which, if any, of the two events 'submit' and 'decline' that first follows the event 'open'. I have tried using the FIRST_VALUE partition function but have not made it successfully work yet.

So for a session with event sequence: 'open', ... (a number of different events happening in between), 'submit', 'decline', I would like to return 'submit', and for a session with event sequence: open, ... (a number of different events happening in between), 'decline', I would like to return 'decline', and for a sessions for which none of the events 'submit' nor 'decline' happens after 'open', I would like to return null.

You can use the following table with name 'events' for writing example SQL code:enter image description here

I hope the question and its formulation is clear. Thank you very much in advance!

Sincerely, Bertan

Berra
  • 35
  • 7

1 Answers1

2

Use below (assuming you have only one accept or decline per session!)

select *, if(event_name != 'open', null, ['decline', 'accept'][ordinal(
  sum(case event_name when 'decline' then 1 when 'accept' then 2 end) over win
  )]) staus
from your_table
window win as (
  partition by session_id order by event_id 
  rows between 1 following and unbounded following
)    

if apply to sample data in your question - output is

enter image description here

Mikhail Berlyant
  • 165,386
  • 8
  • 154
  • 230