Value from previous row in GROUP BY as column

Question

I have this table:

+----------+-------------+-------------------+------------------+
|    userId|       testId|               date|              note|
+----------+-------------+-------------------+------------------+
| 123123123|            1|2019-01-22 02:03:00|               aaa|
| 123123123|            1|2019-02-22 02:03:00|               bbb|
| 123456789|            2|2019-03-23 02:03:00|               ccc|
| 123456789|            2|2019-04-23 02:03:00|               ddd|
| 321321321|            3|2019-05-23 02:03:00|               eee|
+----------+-------------+-------------------+------------------+

Would like to get newest note (whole row) for each group userId and testId:

SELECT
    n.userId,
    n.testId,
    n.date,
    n.note
FROM 
    notes n
INNER JOIN (
    SELECT 
        userId,
        testId,
        MAX(date) as maxDate
    FROM 
        notes
    GROUP BY 
        userId,
        testId
) temp ON n.userId = temp.userId AND n.testId = temp.testId AND n.date = temp.maxDate

It works.

But now I'd like to also have previous note in each row:

+----------+-------------+-------------------+-------------+------------+
|    userId|       testId|               date|         note|previousNote|
+----------+-------------+-------------------+-------------+------------+
| 123123123|            1|2019-02-22 02:03:00|          bbb|         aaa|
| 123456789|            2|2019-04-23 02:03:00|          ddd|         ccc|
| 321321321|            3|2019-05-23 02:03:00|          eee|        null|
+----------+-------------+-------------------+-------------+------------+

Have no idea how to do it. I heard about LAG() function which might be useful but found no good examples for my case.

I'd like to use it on dataframe in pyspark (if it's important)

if there will be more than two records with same notes than all previous records should be in single column or different column. @robin71 — , Jun 03 '19 at 07:28
@LovePandey I need only one previous value, so it'll always be one column. — Robin71, Jun 03 '19 at 07:31
i thing you are looking such answer which i have given you below. please check out my answer and suggest me. — , Jun 03 '19 at 09:28

Zaynul Abadin Tuhin · Accepted Answer · 2019-06-03T08:17:17.187

2

use lag() and row_number analytic function

select userid,testid,date,note,previous_note
from
(select userid,testid,date,note,
lag(note)over(partition by useid,testid order by date) as previous_note,
row_number() over(partition by userid,testid order by date desc) rn
from table_name
) a where a.rn=1

edited Jun 03 '19 at 08:17

answered Jun 03 '19 at 07:12

Zaynul Abadin Tuhin

31,407
5
33
63

Thanks for answer. There is something wrong because query returns only nulls in `previous_note` column – Robin71 Jun 03 '19 at 08:02
OK, I replaced lead() with lag() and it looks it works. Please, edit you answer if you agree. – Robin71 Jun 03 '19 at 08:08
i think it is not working well. please check once again - @Robin71 – Jun 03 '19 at 09:30

score 1 · Answer 2 · answered Jun 03 '19 at 09:27

select userid,testid,date,note,previous_note from
(select userid,testid,date,note,lead(note)
over(partition by userid,testid order by date desc) as previous_note,
row_number() over(partition by userid,testid order by date desc) srno
from Table_Name
) a where a.srno=1

I hope it will give you right answer which you want. it will give you latest date as new record and previous date note as previous_Note.

Value from previous row in GROUP BY as column

2 Answers2