I have a dataframe:
id author publication_year article_years
1 John Doe 2000 21
1 John Doe 2010 11
2 John Foo 2015 6
2 John Foo 1980 31
3 John Lee 2020 1
3 John Lee 2019 2
I want to create a new column, activity_years, where the max value of article_years for each author is counted as that author's total years of activity. Basically, if the author published his first article in 1980, his activity is 31 years since that first publication.
Expected output:
id author publication_year article_years activity_years
1 John Doe 2000 21 21
1 John Doe 2010 11 21
2 John Foo 2015 6 31
2 John Foo 1980 31 31
3 John Lee 2020 1 2
3 John Lee 2019 2 2
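For reference, here is a minimal sketch of one way this could be done, assuming the dataframe is a pandas DataFrame named df (the name is my assumption) and that rows belonging to the same author share the same id. It groups by id and broadcasts each group's maximum article_years back onto every row with transform:

```python
import pandas as pd

# Rebuild the sample data from the question (df is an assumed variable name)
df = pd.DataFrame({
    "id": [1, 1, 2, 2, 3, 3],
    "author": ["John Doe", "John Doe", "John Foo",
               "John Foo", "John Lee", "John Lee"],
    "publication_year": [2000, 2010, 2015, 1980, 2020, 2019],
    "article_years": [21, 11, 6, 31, 1, 2],
})

# For each id, take the largest article_years and repeat it on every row
# of that group, so the result lines up with the original index
df["activity_years"] = df.groupby("id")["article_years"].transform("max")

print(df)
```

transform is used here rather than agg because it returns a result with the same length as the original dataframe, so it can be assigned directly as a new column. If id is not a reliable key, grouping by author instead should work the same way.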