How do I select a group based on the count (minimal sample size)?

Question

I was handed a nice data set (reds) examining acorn morphology and parasites (weevils) for many trees over several years. However, the sample sizes for each tree are highly variable (5 to 75 acorns per tree). I'm going to set a minimum of 20 acorns for a tree/year combination to enter the data set that will be analyzed.

How do I select a group based on that group's (tree.id) count for any year (year)?

I'm happy to work with dplyr, but I'm not sure how to create a data set with that filter.

What I have so far:

  group_by(tree.id) %>%
  filter(n() >=20)

Thanks,

Jeff

2012L, 2012L, 2012L, 2012L, 2012L, 2012L, 2012L, 2012L, 2012L, 
2012L, 2012L, 2012L, 2012L, 2012L), tree.id = c(45L, 87L, 87L, 
87L, 87L, 87L, 87L, 87L, 87L, 87L, 87L, 87L, 87L, 87L, 87L, 87L, 
87L, 87L, 205L, 87L), species = c("RO", "RO", "RO", "RO", "RO", 
"RO", "RO", "RO", "RO", "RO", "RO", "RO", "RO", "RO", "RO", "RO", 
"RO", "RO", "RO", "RO"), germination = c(0L, 0L, 0L, 0L, 0L, 
0L, 0L, 0L, 0L, 0L, 0L, 0L, 0L, 0L, 0L, 0L, 0L, 0L, 0L, 0L), 
    seed.mass = c(5.305, 5.152, 5.225, 7.684, 6.902, 7.809, 8.48, 
    3.606, 6.541, 8.531, 8.233, 6.284, 6.855, 3.33, 7.628, 7.778, 
    5.955, 5.332, 2.358, 7.617)), row.names = c(NA, 20L), class = "data.frame")

You've almost got it. You need to start with the data frame name, and if you want at least 20 rows per tree.id per year, you need to group by both columns. Assuming your data frame is called `df`, you can do: `df %>% group_by(tree.id, year) %>% filter(n() >= 20)`. — eipi10, Aug 16 '20 at 23:04
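
As a minimal sketch of that suggestion, the snippet below applies the grouped filter to a small made-up data frame. The names `toy`, `min_n`, and `kept` are hypothetical, and the rows are invented for illustration only; they are not the real reds data. Only the `year` and `tree.id` column names are taken from the question.

```r
library(dplyr)

# Hypothetical toy data (NOT the real `reds` data set):
# tree 87 has 22 acorns in 2012 but only 10 in 2013; tree 45 has 5 in 2012.
toy <- data.frame(
  year    = c(rep(2012L, 22), rep(2013L, 10), rep(2012L, 5)),
  tree.id = c(rep(87L, 22),   rep(87L, 10),   rep(45L, 5))
)

min_n <- 20  # assumed threshold: minimum acorns per tree/year combination

kept <- toy %>%
  group_by(tree.id, year) %>%  # one group per tree/year combination
  filter(n() >= min_n) %>%     # keep rows only from groups with enough acorns
  ungroup()                    # drop the grouping so later steps are ungrouped

# Inspect which tree/year combinations survived and how many rows each has
count(kept, tree.id, year)
```

Grouping by `tree.id` alone would pool all years for a tree, so a tree with 22 acorns in one year and 10 in another would pass with 32 rows. Grouping by both columns applies the threshold to each tree/year combination separately, which is what the question asks for.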
