I have the following Dataframe:
---------+--------+----------+-----------+--------------------+--------------------+-------+-----+------------
| id|groupid|| field| oldstring| newstring| created| pkey| project
+-------+-------+---------+--------------------+--------+-------------+--------+-------------+-------+-------+
|1451923| 594128| Team1| [RLA N1]| [N1-UO-SB]| 2013-03-29 13:31:...|DSTECH-55770| 10120|
|1451958| 594140| Team1| [SEP N2]| [SEP N2]| 2013-03-29 13:34:...|DSTECH-56998| 10120|
|1452282| 594308| Team1| [N1-UO-SE]| [SEP N2]| 2013-03-29 14:09:...|DSTECH-57900| 10120|
|1492252| 610736| Team1| [N1-UO-SE]| [SEP N2]| 2013-04-17 08:48:...|DSTECH-59560| 10120|
|5105082|2304145| Team1| [Aucun]|[SEP-SUPPORT]| 2017-09-01 09:46:...| ECO-9781| 10280|
|5105084|2304145| Team2| null| SEP-SUPPORT| 2017-09-01 09:46:...| ECO-9781| 10280|
|5105084|2304145| Team1| [ISR N2]| SEP-SUPPORT | 2013-03-29 13:31:... |DSTECH-57895| 10120|
|1451926|594129 | Team1| [N1-UO-SE]| [ISR N2] |2013-03-29 13:55:... |DSTECH-57895| 10120|
|1452182|594273 | Team1| [N1-UO-SE]| [SEPN1-ENV] |2013-03-29 13:43:... |DSTECH-57895| 10120|
I would like compute the treatment date/time of the [pkey]
For exemple, I have these two lines:
| id|groupid|| field| oldstring| newstring| created| pkey|
+-------+-------+---------+--------------------+--------+-------------+--------+-------------+-------+-------+
|1451923| 594128| Team1| [RLA N1]| [N1-UO-SB]| 2013-03-29 13:31:...|DSTECH-55770|
|1451958| 594140| Team1| [SEP N2]| [SEP N2]| 2013-03-29 13:34:...|DSTECH-56998|
The treatment date/time of [DSTECH-55770] = [2013-03-29 13:34:...] - [2013-03-29 13:31:...]
How can I to compute this difference with the previous date, I found that I can do it using User Defined Aggregating Functions UDAF. But I do not if this solution is useful to display the difference between two dates in a number (for example: 8h:30min), I do not mean by 8H is at clock 8H but number of hours is 8.
If someone can help me how can I do it using UDAF or if you have another solution ? Thank you