2x2 contingency matrix (rows are the clusters of Ci, columns are the clusters of Cj):
        Cj
      2  1
Ci    1  0
Translates to:
[[ 0 0 0 1 ]
[ 0 0 1 0 ]]
The contingency matrix represents the outcome of two clustering algorithms, each with two clusters. The row sums indicate that Ci has three data points in, say, cluster 1 and one data point in, say, cluster 2. The column sums indicate that Cj has three data points in, say, cluster A and one data point in, say, cluster B. Therefore, both algorithms "agree" on two out of N = 4 data points.
Since there is no adjusted mutual information function that takes a contingency matrix as input, I would like to convert the contingency matrix into 1D inputs for the sklearn implementation of AMI.
Is there an efficient way to rewrite an NxN contingency matrix in 1D vector form in Python?
It would look something like:
V1 = []
V2 = []
for each row index i:
    for each column index j:
        append contingency[i][j] elements with value i to V1 and with value j to V2
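For reference, the loop above could be sketched in plain Python roughly as follows. The function name `contingency_to_labels` is made up for illustration, and the sketch assumes the matrix is given as a list of rows holding non-negative integer counts.

```python
def contingency_to_labels(contingency):
    """Expand a contingency matrix into two label vectors (hypothetical helper).

    Assumes `contingency` is a sequence of rows of non-negative integers.
    """
    v1, v2 = [], []
    for i, row in enumerate(contingency):
        for j, count in enumerate(row):
            # `count` data points carry row label i and column label j
            v1.extend([i] * count)
            v2.extend([j] * count)
    return v1, v2


v1, v2 = contingency_to_labels([[2, 1], [1, 0]])
print(v1)  # [0, 0, 0, 1]
print(v2)  # [0, 0, 1, 0]
```

This runs in time proportional to the number of cells plus the number of data points, which may be fine for small matrices but is not vectorized.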
The output should always be two vectors. Another example:
2 0 0
0 1 0
0 0 1
Would lead to two 1D vectors:
0 0 1 2
0 0 1 2
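One possible vectorized approach, sketched here with NumPy, is to take the indices of the non-zero cells and repeat each index by its count. The helper name `contingency_to_labels_np` is hypothetical, and the sketch assumes the matrix holds non-negative integer counts (`np.repeat` needs integer repeat counts).

```python
import numpy as np


def contingency_to_labels_np(contingency):
    """Expand a contingency matrix into two 1D label arrays (hypothetical helper).

    Assumes integer, non-negative counts.
    """
    c = np.asarray(contingency)
    # Row and column indices of the non-zero cells, in row-major order
    rows, cols = np.nonzero(c)
    counts = c[rows, cols]
    # Repeat each row/column index as many times as its cell count
    return np.repeat(rows, counts), np.repeat(cols, counts)


v1, v2 = contingency_to_labels_np([[2, 0, 0], [0, 1, 0], [0, 0, 1]])
print(v1.tolist())  # [0, 0, 1, 2]
print(v2.tolist())  # [0, 0, 1, 2]
```

The two arrays should then be usable as the label arguments of `sklearn.metrics.adjusted_mutual_info_score`.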