I have abundance data for observations of an insect at different dates and places, and I want to expand the data frame so that I get one row for each individual insect that was observed.

    set.seed(1234)
    df <- expand.grid(factor = c("A", "B"),
        date = seq(as.Date("2019-05-04"), as.Date("2019-05-08"),"day"))
    df$Abundance <- sample(seq(3,10,1), nrow(df), replace = TRUE)

What I have is:

       factor       date Abundance
    1       A 2019-05-04         3
    2       B 2019-05-04         7
    3       A 2019-05-05         7
    4       B 2019-05-05         7
    5       A 2019-05-06         9
    6       B 2019-05-06         8
    7       A 2019-05-07         3
    8       B 2019-05-07         4
    9       A 2019-05-08         8
    10      B 2019-05-08         7

Now I want to transform the data frame so that it looks like this:

       factor       date Abundance
    1       A 2019-05-04         1
    2       A 2019-05-04         1
    3       A 2019-05-04         1
    4       B 2019-05-04         1
    5       B 2019-05-04         1
    6       B 2019-05-04         1
    7       B 2019-05-04         1
    8       B 2019-05-04         1
    9       B 2019-05-04         1
    10      B 2019-05-04         1

    ...

Does anybody know how to do that with dplyr?

Thanks for your help!

Pharcyde

2 Answers

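We can use uncount from tidyr, which repeats each row according to the count in Abundance:

    library(tidyverse)
    # uncount() repeats each row Abundance times and drops the Abundance column,
    # so mutate() adds it back with a value of 1 per individual
    uncount(df, Abundance) %>%
      mutate(Abundance = 1)
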
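To see what uncount is doing, here is a minimal sketch on a small hand-made data frame. The names obs, long and individual and the values are illustrative assumptions, not taken from the question. By default uncount drops the weights column (.remove = TRUE), which is why Abundance is added back afterwards. The optional .id argument numbers the copies made from each original row.

```r
library(tidyr)
library(dplyr)

# Small illustrative data frame (hypothetical values, not the question's data)
obs <- data.frame(
  factor = c("A", "B"),
  date = as.Date(c("2019-05-04", "2019-05-04")),
  Abundance = c(3, 2)
)

# Repeat each row Abundance times; .id numbers the copies of each original row
long <- obs %>%
  uncount(Abundance, .id = "individual") %>%
  mutate(Abundance = 1)

# One row per individual: the row count equals the total abundance
nrow(long) == sum(obs$Abundance)
```

If a per-row identifier is not needed, the .id argument can simply be left out, which gives the same call as above.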
akrun

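You could use rep and slice to repeat every row Abundance number of times, then set Abundance to 1.

    library(dplyr)

    df %>%
      # rep(1:n(), Abundance) repeats each row index Abundance times
      slice(rep(1:n(), Abundance)) %>%
      # every resulting row now stands for one individual
      mutate(Abundance = 1)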


    #   factor       date Abundance
    #1       A 2019-05-04         1
    #2       A 2019-05-04         1
    #3       A 2019-05-04         1
    #4       B 2019-05-04         1
    #5       B 2019-05-04         1
    #6       B 2019-05-04         1
    #7       B 2019-05-04         1
    #8       B 2019-05-04         1
    #9       B 2019-05-04         1
    #10      B 2019-05-04         1
    #....

The same in base R would be:

    # index the rows with the repeated row numbers, then reset Abundance to 1
    transform(df[rep(1:nrow(df), df$Abundance), ], Abundance = 1)

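Both versions rely on rep accepting a vector of counts as its second argument (times), so that each element is repeated its own number of times. A minimal sketch of that step in isolation follows. The names toy, idx and expanded are illustrative assumptions, not from the question.

```r
# rep() with a vector of counts repeats each element that many times
idx <- rep(1:3, c(2, 1, 3))
idx
# 1 1 2 3 3 3

# Indexing rows by idx duplicates them accordingly (hypothetical toy data)
toy <- data.frame(id = c("a", "b", "c"), n = c(2, 1, 3))
expanded <- toy[idx, ]

# The expanded frame has one row per counted unit
nrow(expanded) == sum(toy$n)
```

A count of 0 produces no copies, so rows with zero abundance would drop out of the result.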
Ronak Shah