I'm trying to understand the example here which computes Jaccard similarity between pairs of vectors in a matrix.
val aBinary = adjacencyMatrix.binarizeAs[Double]
// intersectMat holds the size of the intersection of row(a)_i n row (b)_j
val intersectMat = aBinary * aBinary.transpose
val aSumVct = aBinary.sumColVectors
val bSumVct = aBinary.sumRowVectors
//Using zip to repeat the row and column vectors values on the right hand
//for all non-zeroes on the left hand matrix
val xMat = intersectMat.zip(aSumVct).mapValues( pair => pair._2 )
val yMat = intersectMat.zip(bSumVct).mapValues( pair => pair._2 )
Why does the last comment mention non-zero values? As far as I'm aware, the ._2
accessor selects the second element of a pair regardless of the first element. At what point are (0, x)
pairs dropped?
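
To make the question concrete, here is a minimal plain-Scala sketch of one possible reading. It does not use the matrix library at all. It assumes the matrix is stored sparsely, so that only non-zero cells exist as entries; under that assumption a (0, x) pair would never be created in the first place, rather than being removed later. The names SparseZipSketch and zipWithColVector, and the sample values, are hypothetical and only serve the illustration; whether the library actually behaves this way is exactly what is being asked.

```scala
// Plain-Scala model, NOT the library's implementation.
// Assumption: a sparse matrix keeps only its non-zero cells, keyed by (row, col).
object SparseZipSketch {
  type Sparse = Map[(Int, Int), Double]

  // Hypothetical stand-in for zipping a matrix with a column vector:
  // every stored cell (i, j) is paired with the vector value for row i.
  // Cells that are absent from the map (the zeros) never produce a pair.
  def zipWithColVector(m: Sparse, v: Map[Int, Double]): Map[(Int, Int), (Double, Double)] =
    m.map { case ((i, j), x) => ((i, j), (x, v.getOrElse(i, 0.0))) }

  // Made-up intersection counts; cell (0, 2), for example, is zero and so not stored.
  val intersect: Sparse = Map(
    (0, 0) -> 2.0, (0, 1) -> 1.0,
    (1, 0) -> 1.0, (1, 1) -> 1.0,
    (2, 2) -> 1.0
  )

  // Made-up per-row sums, keyed by row index.
  val rowSums: Map[Int, Double] = Map(0 -> 2.0, 1 -> 1.0, 2 -> 1.0)

  // Same shape as the xMat line in the example: keep only the second element of each pair.
  val xMat: Map[(Int, Int), Double] =
    zipWithColVector(intersect, rowSums).map { case (k, pair) => (k, pair._2) }
}
```

In this model xMat has an entry only where intersect already had one, so pair._2 is never evaluated for a zero cell.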