I am an R novice attempting to analyze some demographic data for a plant species. My dataframe consists of:
TagKey (unique identifier), Year (observation year), TagEstablished (year the plant was first found), and StageClass (0=dead, 1=seedling, 2=vegetative, 3=reproductive). There is a row for each year the plant was visited, but I want 1 row per plant, then columns for its status each year. This is in order to track an individual's status from year to year.
Example data:
TagKey <- c("PDPLM040J0_ALIFOR01_Belt_0", "PDPLM040J0_ALIFOR01_Belt_0", "PDPLM040J0_ALIFOR01_Belt_0", "PDPLM040J0_ALIFOR01_Belt_1", "PDPLM040J0_ALIFOR01_Belt_1", "PDPLM040J0_ALIFOR01_Belt_1")
Year <- c(2020, 2020, 2020, 2021, 2021, 2021)
TagEstablished <- c(2020, 2020, 2020, 2020, 2020, 2020)
StageClass <- c(1, 2, 3, 0, 3, 3)
ALFO_stages <- data.frame(TagKey, Year, TagEstablished, StageClass)
I tried using ddply from the plyr package:
library(plyr)
ALFO_status <- ddply(ALFO_stages, .(TagKey), dplyr::summarize,
  Year_Established = TagEstablished,
  Status2020 = if (Year == "2020") {StageClass},
  Status2021 = if (Year == "2021") {StageClass})
My output is not collapsed to one row per TagKey as desired. The values are correct for their respective years, but the columns for the non-applicable years just spit out NAs. Help?
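One way to get one row per plant with a status column per year is a long-to-wide reshape. Below is a minimal sketch using base R's reshape(). The toy data, the plant_A/plant_B keys, and the names stages and status_wide are hypothetical and not taken from the real dataset. The sketch assumes each plant has at most one row per year, which may not hold for the real data.

```r
# Hypothetical toy data (not the real survey):
# two plants, each visited once in 2020 and once in 2021
stages <- data.frame(
  TagKey         = c("plant_A", "plant_A", "plant_B", "plant_B"),
  Year           = c(2020, 2021, 2020, 2021),
  TagEstablished = c(2020, 2020, 2020, 2020),
  StageClass     = c(1, 2, 3, 0)
)

# Long to wide: one row per plant, one StageClass column per Year.
# idvar columns identify a plant; timevar supplies the column suffixes.
status_wide <- reshape(
  stages,
  idvar     = c("TagKey", "TagEstablished"),
  timevar   = "Year",
  v.names   = "StageClass",
  direction = "wide",
  sep       = "_"
)
rownames(status_wide) <- NULL  # drop the leftover row names from the long data

print(status_wide)
```

The result has the columns TagKey, TagEstablished, StageClass_2020 and StageClass_2021. A plant that was not visited in a given year would get NA in that year's column, which seems a reasonable way to mark a missing observation.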