I have two dataframes: df1 contains many values, and df2 contains a few values that also appear in df1. The values in df2 are the ones that I want to delete from df1.
I have tried to do this by merging, but merging doesn't seem to have an option to keep only the rows of df1 that are not also in df2.
I also tried using the code from this answer to a similar question.
This seemed to work, but it produced a dataframe with fewer values than I was expecting: df1 contained 74911 values, df2 contained 767, and after removing them there were 74064 remaining rather than the expected 74144 (74911 - 767), so 80 additional rows were deleted. I'm not sure why this happened; if I could identify the 80 rows, perhaps I could figure it out.
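One possible explanation, offered only as a guess, is that df1 contains repeated rows that match a row in df2: every copy would be removed, not just one. The sketch below shows one way such rows might be located. It uses small made-up frames (not the real data) and assumes that two rows count as the same only when all four columns agree; key1, key2, matched, dup_rows and n_extra are names introduced here for illustration.

```r
# Toy data (hypothetical): the first row of df1 appears twice.
df1 <- data.frame(chrom = c(1, 1, 2, 3),
                  pos   = c(2, 2, 7, 9),
                  seq_c = c("A", "A", "G", "C"),
                  seq_k = c("G", "G", "C", "A"))
df2 <- data.frame(chrom = 1, pos = 2, seq_c = "A", seq_k = "G")

# Build one key string per row by pasting all columns together.
key1 <- do.call(paste, c(df1, sep = "\r"))
key2 <- do.call(paste, c(df2, sep = "\r"))

# Rows of df1 that match some row of df2.
matched <- df1[key1 %in% key2, ]

# Matching rows of df1 that repeat an earlier row of df1.
dup_rows <- df1[duplicated(key1) & key1 %in% key2, ]

# Surplus of matches over the number of rows in df2.
# Assumption: df2 has no duplicates and all of its rows occur in df1.
n_extra <- nrow(matched) - nrow(df2)
```

On the real data, a non-empty dup_rows would suggest that duplicates account for at least some of the 80 rows. If dup_rows is empty, the cause is more likely elsewhere, for example the removal code matching on fewer columns than intended.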
If anyone can think of an alternative way of achieving my goal I will be very grateful!
Here are some example dataframes; they are very simple compared to the real ones:
chrom <- c(1, 2, 3, 4)
pos <- c(2, 7, 9, 14)
seq_c <- c('A', 'G', 'C', 'T')
seq_k <- c('G', 'C', 'A', 'C')
df1 <- data.frame(chrom, pos, seq_c, seq_k)
chrom <- c(1, 2)
pos <- c(2, 7)
seq_c <- c('A', 'G')
seq_k <- c('G', 'C')
df2 <- data.frame(chrom, pos, seq_c, seq_k)
Expected output would then be:
chrom <- c(3, 4)
pos <- c(9, 14)
seq_c <- c('C', 'T')
seq_k <- c('A', 'C')
df3 <- data.frame(chrom, pos, seq_c, seq_k)
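For reference, the operation described above is often called an anti-join. A minimal base R sketch on the example data follows. It assumes a row should be dropped only when all four columns match a row of df2; key1 and key2 are helper names introduced here, not part of any existing code.

```r
# Example frames from the question, written with named columns.
df1 <- data.frame(chrom = c(1, 2, 3, 4),
                  pos   = c(2, 7, 9, 14),
                  seq_c = c("A", "G", "C", "T"),
                  seq_k = c("G", "C", "A", "C"))
df2 <- data.frame(chrom = c(1, 2),
                  pos   = c(2, 7),
                  seq_c = c("A", "G"),
                  seq_k = c("G", "C"))

# One key string per row, built from every column.
key1 <- do.call(paste, c(df1, sep = "\r"))
key2 <- do.call(paste, c(df2, sep = "\r"))

# Keep only the rows of df1 whose key does not occur in df2.
df3 <- df1[!(key1 %in% key2), ]
rownames(df3) <- NULL  # renumber rows from 1
```

Note that this removes every copy of a matching row, so duplicated rows in df1 would all disappear together. If the dplyr package is an option, anti_join(df1, df2) appears to cover the same case.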