I'm looking to create a cumulative curve of species over time. This is not a species accumulation curve as in vegan; I want a curve that shows the total number of unique species recorded up to each year. An example of my data frame looks like this:

Year   Phylum              SpeciesName
1861  Mollusca        Littorina littorea
1862  Cnidaria        Gersemia rubiformis
1862  Rhodophyta         Ceramium virgatum
1863  Mollusca        Littorina littorea
1863  Chlorophyta        Ulva clathrata
etc etc etc

I would like to aggregate it into a data frame that looks like this:

Year   Cumulative
1861       1
1862       3
1863       4

Littorina littorea was already found in 1861, and therefore its entry in 1863 is not counted in the cumulative number. I can't figure out how to streamline this. Here is what I've tried:

data %>% group_by(Year, Phylum) %>% summarise(Count = n_distinct(SpeciesName)) %>% ungroup() %>% mutate(Cumulative = cumsum(Count))

which gives me:

Year  Phylum       Count  Cumulative
1861  Mollusca     1      1
1862  Cnidaria     1      2
1862  Rhodophyta   1      3
1863  Mollusca     1      4
1863  Chlorophyta  1      5

However, this just counts the unique species per phylum within each year and adds them up, without accounting for the fact that a species may already have shown up in earlier years. I just can't figure out how to count each unique species only once over time. Thanks!
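
For reference, here is a minimal sketch of one way the desired table could be produced with dplyr. It is not a confirmed answer. It assumes the columns are named Year and SpeciesName as in the example above, and that a species counts as "new" in the earliest year it appears. The names `first_seen`, `NewSpecies` and `cumulative` are made up for illustration.

```r
library(dplyr)

# Example rows copied from the question
data <- data.frame(
  Year = c(1861, 1862, 1862, 1863, 1863),
  Phylum = c("Mollusca", "Cnidaria", "Rhodophyta", "Mollusca", "Chlorophyta"),
  SpeciesName = c("Littorina littorea", "Gersemia rubiformis",
                  "Ceramium virgatum", "Littorina littorea", "Ulva clathrata"),
  stringsAsFactors = FALSE
)

# Keep only the first (earliest) record of each species
first_seen <- data %>%
  arrange(Year) %>%
  distinct(SpeciesName, .keep_all = TRUE)

# Count the species first seen in each year, then take a running total
cumulative <- first_seen %>%
  count(Year, name = "NewSpecies") %>%
  mutate(Cumulative = cumsum(NewSpecies))

print(cumulative)
```

With the five example rows this yields cumulative totals of 1, 3 and 4 for 1861, 1862 and 1863. Note that a year in which no new species appears would be missing from the result under this approach, so it would need to be filled in separately if every year should be plotted.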
