I am working on a Q-matrix problem and need to calculate Hamming distances, but I am having trouble comparing different rows (as vectors) of data frame A against the same row of data frame B.

Specifically, df2 is a 6x3 dataframe and df1 is a 3x3 dataframe.

    df1 <- data.frame(Q1= c(1,0,1), Q2= c(0,0,0), Q3= c(1,1,0))
    df2 <- data.frame(A =c(1,1,0,1,1,0), B =c(0,0,0,1,1,0), C =c(0,1,0,1,1,1))

I need to compare all 6 rows of df2, as vectors, against each row of df1 in turn, and calculate the Hamming distance over all 6 vector pairs for that row (so the output should contain only 3 Hamming distance values). For instance, the 1st row of df1, which is (1,0,1), should be paired with all 6 rows of df2, namely (1,0,0), (1,0,1), (0,0,0), (1,1,1), (1,1,1), (0,0,1), and produce 1 Hamming distance value. Following the same logic for the other two rows, the whole process should give 3 Hamming distance values.

I have tried a double for() loop a few times, with no success. In fact, I don't even know how to extract a row from a data frame as a vector that can be used in calculations with another vector. Can anyone please help?

Edward Lin

1 Answer

This should work:

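    # For each row of df1, count the mismatches against every row of df2
    # and print the total
    for (i in seq_len(nrow(df1))) {
      print(sum(apply(df2, 1, function(x) x != df1[i, ])))
    }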

    [1] 6
    [1] 8
    [1] 8
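
If you also want the individual distances, or want to see how a data frame row becomes a plain vector, here is a minimal base R sketch. It assumes "one value per df1 row" means the sum of the 6 pairwise distances, which is how the loop above reads the question. The names `row1`, `m1`, `m2`, `pairwise` and `totals` are just illustrative.

```r
df1 <- data.frame(Q1 = c(1, 0, 1), Q2 = c(0, 0, 0), Q3 = c(1, 1, 0))
df2 <- data.frame(A = c(1, 1, 0, 1, 1, 0),
                  B = c(0, 0, 0, 1, 1, 0),
                  C = c(0, 1, 0, 1, 1, 1))

# unlist() turns a one-row data frame into a plain numeric vector
row1 <- unlist(df1[1, ], use.names = FALSE)

# Matrices make row-by-row comparison straightforward
m1 <- as.matrix(df1)
m2 <- as.matrix(df2)

# pairwise[i, j] is the Hamming distance between row i of df1 and row j of df2
pairwise <- outer(seq_len(nrow(m1)), seq_len(nrow(m2)),
                  Vectorize(function(i, j) sum(m1[i, ] != m2[j, ])))

# One total per df1 row, assuming the 6 pairwise distances are summed
totals <- rowSums(pairwise)
print(totals)
```

`totals` should match the 6, 8, 8 printed by the loop, and `pairwise` keeps the 3 x 6 table of per-pair distances in case a different summary (mean, minimum) is what is actually needed.
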
drJones