I have df
below which I have sorted according to the ID
variable and the time variable T
as the secondary sort.
df = pd.DataFrame({
'ID': ['a', 'b', 'c', 'b', 'b'],
'T': [
datetime.datetime(2019, 1, 1),
datetime.datetime(2017, 1, 1),
datetime.datetime(2018, 1, 1),
datetime.datetime(2020, 1, 1),
datetime.datetime(2021, 1, 1)],
'V': [3, 5, 8, 6, 1]
}).sort_values(['ID', 'T'], ascending=False)
df
ID T V
2 c 2018-01-01 8
4 b 2021-01-01 1
3 b 2020-01-01 6
1 b 2017-01-01 5
0 a 2019-01-01 3
I want to add a new column V_L
where, for each ID
, the last value (based on the time column T
) is shown. If there is no last value this should be indicated by a null
value in V_L
. An example output would look like this:
df
ID T V V_L
0 a 2018-01-01 8 NaN
1 b 2021-01-01 1 6.0
2 b 2020-01-01 6 5.0
3 b 2017-01-01 5 NaN
4 c 2019-01-01 3 NaN