
I am working with taxonomic data and want to filter it to make a more precise graph. I am working with Family-level data and need to write code that keeps only the Families that appear more than 100 times in the data. The y-axis should show the number of appearances and the x-axis the Family name. I have the plot itself figured out, but the filtering still isn't working. I am using ggplot2 with geom_bar. I need code that counts how often each value occurs in the Family column and includes only the values that occur more than 100 times. Is this possible?

Emily
    Please show the code you are trying, you will usually learn more from us helping you overcome your mistakes rather than just seeing a working example. – Gregor Thomas Mar 06 '19 at 19:34

1 Answer


Here is an example from the diamonds dataset:

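library(tidyverse)

diamonds %>%
  group_by(color) %>%
  count() %>%             # one row per color, with the tally in column n
  filter(n > 100) %>%     # keep colors that appear more than 100 times
  print() %>%             # show the filtered counts, then pass them on
  ggplot() +
  geom_point(aes(x = color, y = n))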
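
Applied to the setup in the question, the same pattern might look like the sketch below. The data frame `taxa` and its simulated contents are hypothetical stand-ins for the real data; only the `Family` column name comes from the question. It uses `geom_col()` rather than `geom_point()` or `geom_bar()`, on the assumption that a bar chart of already-computed counts is what is wanted.

```r
library(dplyr)
library(ggplot2)

# Hypothetical stand-in for the real data: one row per observation,
# with a Family column as described in the question.
taxa <- data.frame(
  Family = rep(c("Felidae", "Canidae", "Ursidae", "Mustelidae"),
               times = c(250, 101, 100, 12))
)

family_counts <- taxa %>%
  group_by(Family) %>%
  count() %>%          # n = number of rows for each Family
  ungroup() %>%
  filter(n > 100)      # strictly more than 100, so a Family with exactly 100 rows is dropped

# geom_col() draws bars whose heights are taken from the n column.
p <- ggplot(family_counts, aes(x = Family, y = n)) +
  geom_col() +
  labs(x = "Family", y = "Number of appearances")

print(p)
```

If Families with exactly 100 appearances should be kept as well, the condition would be `n >= 100` instead.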
user2292410
    you can replace `group_by(color) %>% count()` by just `count(color)`, since `count()` calls `group_by()` before and `ungroup()` after (https://dplyr.tidyverse.org/reference/tally.html) – i94pxoe Mar 08 '19 at 07:21
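
The shorthand suggested in this comment can be checked with a small sketch. It compares the two forms on the `diamonds` dataset; the names `long_form` and `short_form` are made up for illustration. Note that `group_by(color) %>% count()` on its own leaves the result grouped by `color`, so an explicit `ungroup()` is added to make the two results directly comparable.

```r
library(dplyr)
library(ggplot2)  # provides the diamonds dataset

# Long form: group, count, then drop the grouping again.
long_form <- diamonds %>%
  group_by(color) %>%
  count() %>%
  ungroup()

# Short form: count() handles the grouping and ungrouping itself.
short_form <- diamonds %>%
  count(color)

# Both have one row per color with the tally in column n.
all.equal(as.data.frame(long_form), as.data.frame(short_form))
```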