Extracting the observations of two species from the iris dataset in R

Question

The iris dataset contains 150 observations of three plant species (setosa, versicolor and virginica), with 50 observations of each species. I would like to create a new dataframe, called "a", containing only the observations of two of these species (setosa and versicolor). I have been trying to use the code below to do this, but it apparently does some sort of cycling; that is, instead of returning observations from 1 to 100 (which is what I'd like), it returns observations 1, 3, 5, 7, ..., 100.

data(iris)

a <- subset(iris, Species == c("setosa", "versicolor"))

or

a <- iris[iris$Species == c("setosa","versicolor"),]

I would be grateful if anyone could help me figure out what I am doing wrong. I am aware that there are much simpler ways to get the dataframe I want (e.g., the code listed below), but I would really like to understand why the code above does not work. I want to apply this to more complex datasets where I have to extract many species, and I would like to extract them by name.

a <- iris[1:100,] # this returns the dataframe I want

# or

a <- subset(iris, Species != "virginica") # this returns the dataframe I want as well

You were trying to find all the items that "==" both of those values. Since `==` compares element by element and recycles the shorter vector, each row was checked against only one of the two values. You could also have succeeded with `Species == "setosa" | Species == "versicolor"`, but the simplest substitution for `==` was `%in%`. - IRTFM, Jun 12 '22 at 18:59
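
The recycling behaviour can be checked directly in base R. This is only a minimal sketch; `keep_eq` and `keep_in` are names made up for illustration:

```r
data(iris)

# `==` works element by element. The length-2 vector on the right is
# recycled to length 150: "setosa", "versicolor", "setosa", "versicolor", ...
# so odd rows are compared with "setosa" and even rows with "versicolor".
keep_eq <- iris$Species == c("setosa", "versicolor")

which(keep_eq)[1:5]  # first few row numbers that pass the test
sum(keep_eq)         # number of rows kept by `==`

# `%in%` asks, for each row, whether its species is a member of the set,
# so the order and length of the right-hand vector do not matter.
keep_in <- iris$Species %in% c("setosa", "versicolor")

sum(keep_in)         # number of rows kept by `%in%`
```

Because 150 is a multiple of 2, R recycles silently and gives no warning, which is probably why the problem is easy to miss.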

score 1 · Answer 1 · answered Jun 12 '22 at 18:47

library(tidyverse)

iris %>%  
  filter(Species %in% c("setosa", "versicolor"))

# A tibble: 100 x 5
   Sepal.Length Sepal.Width Petal.Length Petal.Width Species
          <dbl>       <dbl>        <dbl>       <dbl> <fct>  
 1          5.1         3.5          1.4         0.2 setosa 
 2          4.9         3            1.4         0.2 setosa 
 3          4.7         3.2          1.3         0.2 setosa 
 4          4.6         3.1          1.5         0.2 setosa 
 5          5           3.6          1.4         0.2 setosa 
 6          5.4         3.9          1.7         0.4 setosa 
 7          4.6         3.4          1.4         0.3 setosa 
 8          5           3.4          1.5         0.2 setosa 
 9          4.4         2.9          1.4         0.2 setosa 
10          4.9         3.1          1.5         0.1 setosa 
# ... with 90 more rows
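
For anyone who prefers to stay in base R, the same idea should work with the two forms from the question once `==` is swapped for `%in%`. A sketch, assuming the species to keep are stored in a vector (`wanted` is a hypothetical name, not something from the question):

```r
data(iris)

# Hypothetical vector of species names to keep; add more names as needed
wanted <- c("setosa", "versicolor")

# Bracket indexing, as in the question, with `%in%` instead of `==`
a <- iris[iris$Species %in% wanted, ]

# The subset() form, with the same change
a2 <- subset(iris, Species %in% wanted)

nrow(a)           # number of rows kept
table(a$Species)  # counts per species in the result
```

Keeping the names in one vector means the filter line stays the same when the list of species grows.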
