6

I have two dataframes:

df1 <- data.frame(cola = c("dum1", "dum2", "dum3"), colb = c("bum1", "bum2", "bum3"), colc = c("cum1", "cum2", "cum3"))

and:

df2 <- data.frame(cola = c("dum1", "dum2", "dum4"), colb = c("bum1", "bum2", "bum3"))

I need to find the indices of the rows in df1 whose values in cola and colb also occur in df2; here that would be rows 1 and 2. I know the inner_join function from the dplyr package, but it returns a new dataframe, and I just need a vector with the indices. I could do this with which for each column, but that gets ugly when the common rows have to be found across a large number of columns.

Any help is much appreciated.

Cactus
  • Related: [How do I tag rows with two variables that match rows in a second data frame? R](https://stackoverflow.com/questions/24809225/how-do-i-tag-rows-with-two-variables-that-match-rows-in-a-second-data-frame-r), where the output is logical instead of an index. – Henrik Jan 08 '18 at 12:10

2 Answers

5

A more general way of solving this would look like:

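# Columns that both data frames have in common
colsToUse <- intersect(colnames(df1), colnames(df2))
# Paste each row into one string, then look up the rows of df1 in df2
match(do.call("paste", df1[colsToUse]), do.call("paste", df2[colsToUse]))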

[1] 1 2 NA
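The vector above gives, for each row of df1, the position of its match in df2 (NA where there is none). To get the row indices of df1 that the question asks for, one option is to test membership and wrap it in which. This is only a sketch: key1, key2, idx and the "\r" separator are illustrative choices rather than part of the answer above, and the separator is assumed not to occur in the data (it guards against two different rows pasting to the same string).

```r
df1 <- data.frame(cola = c("dum1", "dum2", "dum3"),
                  colb = c("bum1", "bum2", "bum3"),
                  colc = c("cum1", "cum2", "cum3"))
df2 <- data.frame(cola = c("dum1", "dum2", "dum4"),
                  colb = c("bum1", "bum2", "bum3"))

# Columns that both data frames have in common
colsToUse <- intersect(colnames(df1), colnames(df2))

# One key string per row; "\r" is an assumed separator that should not appear in the data
key1 <- do.call(paste, c(df1[colsToUse], sep = "\r"))
key2 <- do.call(paste, c(df2[colsToUse], sep = "\r"))

# Row numbers of df1 that have a matching row anywhere in df2
idx <- which(key1 %in% key2)
idx
```

Because paste converts factors to their labels, this should behave the same whether the columns are characters or factors.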

RolandASc
  • In the end I decided to go with `which(do.call(paste, s.opts1[, c(2:3)]) == do.call(paste, dm.rw1))`, but the `do.call` did the trick. So thanks a lot! – Cactus Jan 08 '18 at 13:46
1

Just compare the first two columns of df1 with df2, row by row:

which(apply(df1[1:2] == df2, 1, prod) == 1)

Onyambu
  • Thank you, but this gives an error that levels of factors are not the same: `Error in Ops.factor(left, right)` – Cactus Jan 08 '18 at 13:38
  • Ensure your dataframe elements are characters instead of factors, i.e. `df1=data.frame(...,stringsAsFactors=FALSE)` – Onyambu Jan 08 '18 at 13:40
  • This helps, but I do not want to have to convert the columns to characters beforehand, and I find the `do.call` solution from RolandASc more direct. Thank you nevertheless. – Cactus Jan 08 '18 at 13:47
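
The factor error discussed in these comments can also be avoided without rebuilding the dataframes, by comparing character matrices instead. The following is a minimal sketch, not the answerer's code: cols, a, b and same are illustrative names, and it assumes that both dataframes have the same number of rows and that the shared columns hold characters or factors. Like the `apply` version, it compares rows position by position rather than searching df2 for a match.

```r
df1 <- data.frame(cola = c("dum1", "dum2", "dum3"),
                  colb = c("bum1", "bum2", "bum3"),
                  colc = c("cum1", "cum2", "cum3"),
                  stringsAsFactors = TRUE)
df2 <- data.frame(cola = c("dum1", "dum2", "dum4"),
                  colb = c("bum1", "bum2", "bum3"),
                  stringsAsFactors = TRUE)

# Columns that both data frames have in common
cols <- intersect(colnames(df1), colnames(df2))

# as.matrix turns factor columns into their character labels,
# so differing factor levels no longer matter
a <- as.matrix(df1[cols])
b <- as.matrix(df2[cols])

# Rows where every shared column agrees at the same row position
same <- unname(which(rowSums(a == b) == length(cols)))
same
```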