My Spark DataFrame contains the following data:

 user_id | id | timestamp
---------|----|-------------------
 123     | 2  | 2018-10-12 9:25:30
 123     | 3  | 2018-10-12 9:27:20
 123     | 4  | 2018-10-12 9:45:15
 123     | 5  | 2018-10-12 9:47:40
 234     | 6  | 2018-10-12 9:26:32
 234     | 7  | 2018-10-12 9:28:21
 234     | 8  | 2018-10-12 9:46:16
 234     | 9  | 2018-10-12 9:48:43

For each user, I need to count the records that fall into the same group, where consecutive records are less than 15 minutes apart. The result should look like this:

 user_id | count | window
---------|-------|----------------------------------------
 123     | 2     | 2018-10-12 9:25:30 - 2018-10-12 9:27:20
 123     | 2     | 2018-10-12 9:45:15 - 2018-10-12 9:47:40
 234     | 2     | 2018-10-12 9:26:32 - 2018-10-12 9:28:21
 234     | 2     | 2018-10-12 9:46:16 - 2018-10-12 9:48:43
Igorock
  • What is the expected result for a sequence of timestamps such as: – sadhen Oct 13 '18 at 07:44
  • 9:20, 9:21, 9:22, ..., 9:50, 9:51, 9:52? Please define it. – sadhen Oct 13 '18 at 07:45
  • Sorry, I don't understand your question. – Igorock Oct 13 '18 at 07:45
  • http://spark.apache.org/docs/latest/structured-streaming-programming-guide.html#window-operations-on-event-time may be helpful for you. You must define your window logic. – sadhen Oct 13 '18 at 07:48
  • I mean, a timestamp may be counted in multiple windows (intervals). – sadhen Oct 13 '18 at 07:49
  • Thanks, but I need custom window intervals based on the first and last record in a group whose consecutive records are less than 15 min apart. My window could be, for example, 1 hour long if it contains no gap greater than 15 min. Spark's window function creates fixed 15-min intervals. – Igorock Oct 13 '18 at 08:09
  • @Igorock can you elaborate more on "if no gap greater than 15m" in your previous comment? and please provide an example for it – vdep Oct 13 '18 at 08:29
  • If I have messages with timestamps 9:05 and 9:07, the gap is 2 min, so they should go into one window. If I have a third message with timestamp 9:27, the gap from the previous message is 20 min, so it goes into a new window. – Igorock Oct 13 '18 at 09:01
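
The rule described in the last comment (a new window starts whenever the gap from the previous record of the same user is too large) can be sketched in plain Python to pin down the expected output. This is only an illustration of the logic, not a Spark solution: the helper names `at` and `sessionize` are hypothetical, and treating a gap of exactly 15 minutes as the start of a new window is an assumption, since the question only says "less than 15 min".

```python
from datetime import datetime, timedelta
from itertools import groupby
from operator import itemgetter

# Assumption: a gap of 15 minutes or more starts a new window.
GAP = timedelta(minutes=15)


def at(hour, minute, second):
    """Hypothetical helper: a timestamp on 2018-10-12, the date used in the question."""
    return datetime(2018, 10, 12, hour, minute, second)


def sessionize(rows, gap=GAP):
    """Group (user_id, timestamp) rows into gap-based windows.

    Returns a list of (user_id, count, window_start, window_end) tuples.
    """
    result = []
    ordered = sorted(rows, key=itemgetter(0, 1))  # by user, then by time
    for user_id, group in groupby(ordered, key=itemgetter(0)):
        start = end = None
        count = 0
        for _, ts in group:
            # Close the current window when the gap to the previous record is too large.
            if end is not None and ts - end >= gap:
                result.append((user_id, count, start, end))
                start, count = ts, 0
            if start is None:
                start = ts
            end = ts
            count += 1
        result.append((user_id, count, start, end))  # flush the last window
    return result


# Sample data from the question (the id column is not needed for the grouping).
rows = [
    (123, at(9, 25, 30)), (123, at(9, 27, 20)),
    (123, at(9, 45, 15)), (123, at(9, 47, 40)),
    (234, at(9, 26, 32)), (234, at(9, 28, 21)),
    (234, at(9, 46, 16)), (234, at(9, 48, 43)),
]
windows = sessionize(rows)
for user_id, count, start, end in windows:
    print(user_id, count, start, "-", end)
```

On the sample data this yields two windows of two records for each user, matching the desired table. In Spark, the same idea is usually expressed with `lag` over a window partitioned by `user_id` and ordered by `timestamp`, flagging rows whose gap exceeds the threshold and taking a running sum of that flag as a window id to group on; that translation is not shown or tested here.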

0 Answers