Below is a simplified data set, organized on a country-year basis:
country <- c("CountryA", "CountryA", "CountryA", "CountryA",
"CountryB", "CountryB", "CountryB", "CountryB",
"CountryC", "CountryC", "CountryC", "CountryC")
year <- c(2001, 2002, 2003, 2004,
2001, 2002, 2003, 2004,
2001, 2002, 2003, 2004)
v1 <- c(2, 3, 5, 4, 3, 3, 1, 2, 1, 4, 3, 2)
df1 <- data.frame(country, year, v1)
df1
 country year v1
CountryA 2001  2
CountryA 2002  3
CountryA 2003  5
CountryA 2004  4
CountryB 2001  3
CountryB 2002  3
CountryB 2003  1
CountryB 2004  2
CountryC 2001  1
CountryC 2002  4
CountryC 2003  3
CountryC 2004  2
My question is:
How can I write code that creates an incident-based subset of the above data set, like the one below?
cntry <- c("CountryA", "CountryB", "CountryC")
stYear <- c(2001, 2002, 2003)
endYear <- c(2003, 2004, 2003)
v1Max <- c(5, 3, 3)
v1Ave <- c(3.33, 2, 3)
df2 <- data.frame(cntry, stYear, endYear, v1Max, v1Ave)
df2
   cntry stYear endYear v1Max v1Ave
CountryA   2001    2003     5  3.33
CountryB   2002    2004     3  2.00
CountryC   2003    2003     3  3.00
In other words, I need to code each incident as a separate row in a new data frame. (For example, the first row of df2 above is the incident in CountryA from 2001 to 2003.) While doing this, I also need to summarize the values that fall within the corresponding time frame. (For example, v1Max in df2 is the maximum value that v1 takes in df1 over the duration of the incident. Similarly, v1Ave in df2 is the average of v1 over the same period.)
If you can provide code that performs the above transformation from df1 to df2, I can then extend it to solve my actual problem.
Thanks!
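For reference, here is a minimal base R sketch of the transformation described above. It assumes the incident windows (country, start year, end year) are already known and stored in a separate data frame; the name `incidents` is hypothetical and not part of the original data. For each incident it selects the rows of df1 that fall inside the window and computes the maximum and the mean of v1.

```r
# Source data from the question (country-year panel).
df1 <- data.frame(
  country = rep(c("CountryA", "CountryB", "CountryC"), each = 4),
  year    = rep(2001:2004, times = 3),
  v1      = c(2, 3, 5, 4, 3, 3, 1, 2, 1, 4, 3, 2),
  stringsAsFactors = FALSE
)

# Assumed input (hypothetical name): one row per incident with its
# country and the first and last year of the incident.
incidents <- data.frame(
  cntry   = c("CountryA", "CountryB", "CountryC"),
  stYear  = c(2001, 2002, 2003),
  endYear = c(2003, 2004, 2003),
  stringsAsFactors = FALSE
)

# Build one summary row per incident.
rows <- lapply(seq_len(nrow(incidents)), function(i) {
  inc <- incidents[i, ]
  # Rows of df1 belonging to this country and inside the year window.
  in_window <- df1$country == inc$cntry &
    df1$year >= inc$stYear &
    df1$year <= inc$endYear
  vals <- df1$v1[in_window]
  data.frame(inc, v1Max = max(vals), v1Ave = round(mean(vals), 2))
})

# Stack the per-incident rows into the final data frame.
df2 <- do.call(rbind, rows)
df2
```

With the sample data, this should reproduce the v1Max and v1Ave columns of the desired df2. If an incident window matches no rows in df1, `max()` returns `-Inf` with a warning, so real data may need a guard for empty windows.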