How can I keep those rows in one data frame (df1) that I have identified in a second data frame (keep_sites)? In the example dataset below, I have three data variables (Data1, Data2, Data3) associated with four different sites (Site). I would like to keep all the rows in df1 for only those sites in the keep_sites data frame.
Example Dataset:
df1 <- data.frame(matrix(ncol = 4, nrow = 12))
x <- c("Site","Data1","Data2","Data3")
colnames(df1) <- x
df1$Site <- rep(c("A","B","C","D"),3)
set.seed(99)
df1$Data1 <- rnorm(12,4,1)
df1$Data2 <- rnorm(12,16,2)
df1$Data3 <- rnorm(12,32,4)
df1[order(df1$Site, decreasing = FALSE),]
keep_sites <- data.frame(matrix(ncol = 1, nrow = 2))
y <- "Site"
colnames(keep_sites)[1] <- y
keep_sites[1,1] <- "A"
keep_sites[2,1] <- "C"
I have tried this, but it only returns the rows associated with the first site (site A) in keep_sites:
df2 <- df1[df1$Site == keep_sites$Site,]
The correct output should look like this:
   Site    Data1     Data2    Data3
1     A 4.213963 17.500109 32.89067
5     A 3.637162 15.211962 26.53026
9     A 3.635883 18.197843 31.41482
3     C 4.087829  9.918132 34.73457
7     C 3.136155 16.997263 37.49222
11    C 3.254231 15.881167 22.82112
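One possible approach, offered as a minimal sketch rather than the only answer: the attempt with == compares the two vectors element by element and recycles the shorter one, so c("A", "C") is stretched to A, C, A, C, ... and only the site A rows happen to line up. Base R's %in% operator instead tests each value of df1$Site for membership in keep_sites$Site. The block below rebuilds the example data in a slightly more compact form (an assumption made for brevity; the values are drawn in the same order after set.seed(99)) and then filters and sorts it.

```r
# Rebuild the example data, drawing the random columns in the same order
# as in the question so the seed produces the same sequence of draws.
set.seed(99)
df1 <- data.frame(Site = rep(c("A", "B", "C", "D"), 3))
df1$Data1 <- rnorm(12, 4, 1)
df1$Data2 <- rnorm(12, 16, 2)
df1$Data3 <- rnorm(12, 32, 4)

keep_sites <- data.frame(Site = c("A", "C"))

# %in% returns TRUE for every row whose Site appears anywhere in
# keep_sites$Site, regardless of position or vector length.
df2 <- df1[df1$Site %in% keep_sites$Site, ]

# Sort by Site so the rows are grouped as in the desired output.
df2 <- df2[order(df2$Site), ]
df2
```

The original row names are kept after subsetting, which is why the result is labelled 1, 5, 9, 3, 7, 11 rather than 1 to 6. If the lookup table had more than one key column, merge(df1, keep_sites) would be another base R option to consider, though it resets the row names.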