I have events
table in clickhouse.
When some user (defined by user_id) come into the room (defined by object_id) or left the room electonic lock opened by key-card should be opened so for each interaction with the lock there are a record in events
table with corresponding event_type
set to "came" or "left".
CREATE TABLE default.events
(
`user_id` UInt32, -- user of event
`object_id` UInt32, -- object of event
`event_type` String, -- type of event, can be "came" or "left"
`event_moment_date` Date, -- date of event
`event_moment` DateTime -- datetime of event
)
ENGINE = MergeTree(
event_moment_date,
(
object_id,
user_id,
event_moment
),
8192)
I need to output data in form:
user_id, -- user
object_id, -- object
came_moment, -- moment then user entered the room
left_moment -- moment then user leaved the room
So, for every "came" event I need to find matching "left" event that has equal user_id and object_id and has event_datetime in the nearest possible future relatively to "came" event_moment.
I can easily do this in MySQL/Postgrsql with subquery like this:
SELECT
s1.object_id,
s1.user_id,
s1.action_moment as "came_moment",
(
select s2.action_moment from source as s2
where
s1.user_id = s2.user_id
and
s1.object_id = s2.object_id
and
s1.action_moment < s2.action_moment
and
s2.action_type = 'left'
order by s2.action_moment asc
limit 1
) as "left_moment"
FROM
source as s1
where s1.action_type = 'came'
But Clickhouse doens't allow to use outer query fields in subquery (dependent subquery).
I've also tried using JOINs but Clickhouse doens't allow to use "<" or ">" in ON statement of the JOIN. Also I've tried to use neighbour
function of Clickhouse, with order by event_moment but i've achieved only selection of a single row with specific came_moment
and corresponding left_moment
by ordering data and selecting next neighbour row, it seems that I cannot load the full list of this kind of rows.
I starting to think that my task is impossible in ClickHouse, and I need some kind of script (php/python/whatever) which would scan the database and add some "session_id" field, which should has equal id number for every two rows which are corresponding to each other "came" and "left".