I want to use the Hamming distance with kmeans clustering in MATLAB, but I get an error saying that my data must be binary.

Is there any way around this? The data matrix that I use can't be binary (it has a physical interpretation that must allow for the values 0, 1, 2, 3), but it's important that I use the Hamming distance.

Qiu

2 Answers

The data to cluster must be binary (only 0s and 1s). You can convert 0/1 data stored as double, single, or uintX to logical with a single command:

x = logical( y );

To convert uint8 data to binary, check the function uint8tobit(). Also take a look at the de2bi() and bi2de() functions.
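
As a quick check of what this cast does to non-binary input, here is a minimal sketch. The vector y is a made-up example with the value range from the question, not the asker's data. logical() maps every nonzero value to true, so 2 and 3 become indistinguishable from 1:

```matlab
y = [0 1 2 3];          % hypothetical data, values 0..3 as in the question
x = logical(y);         % every nonzero entry becomes true

% True only if y already holds nothing but 0s and 1s,
% i.e. only then does the cast lose no information.
isLossless = all(y(:) == 0 | y(:) == 1);
```

The cast is therefore only safe when isLossless is true; for data with values 0..3 a different encoding is needed.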

Vladislavs Dovgalecs

Per the MATLAB documentation, the Hamming distance measure for kmeans can only be used with binary data, as it's a measure of the percentage of bits that differ.

You could try mapping your data into a binary representation before using the function. You could also look at using the city block distance as an alternative if possible, as it is suitable for non-binary input.
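
One possible binary mapping is a one-hot (indicator) encoding, sketched below under stated assumptions: X, levels and B are hypothetical names, the data are made up, and the encoding is my choice for illustration rather than something prescribed by the documentation. Each feature becomes a block of four 0/1 columns, so the number of rows (observations) stays the same and only the number of columns grows. Two rows then differ in exactly two bits for every feature in which the original values differ, so the bitwise Hamming distance is proportional to the categorical one.

```matlab
% Hypothetical data: 3 observations (rows), 4 features with values in 0..3.
X = [0 1 2 3;
     0 1 2 0;
     3 3 2 0];
levels = 0:3;                       % assumed set of admissible values

[n, p] = size(X);
nl = numel(levels);
B = zeros(n, p * nl);               % one block of nl columns per feature
for k = 1:nl
    % Column k of every block flags where the feature equals levels(k).
    B(:, k:nl:end) = (X == levels(k));
end

% Compare rows 1 and 2: number of differing features vs. differing bits.
dCat = sum(X(1,:) ~= X(2,:));       % categorical mismatches
dBin = sum(B(1,:) ~= B(2,:));       % bit mismatches, equals 2 * dCat
```

With the Statistics Toolbox available, a call such as kmeans(B, k, 'Distance', 'hamming') (k being the number of clusters you choose) should then accept B. Keep in mind that the resulting centroids are not guaranteed to be valid one-hot codes.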

goric
  • Thanks, I mapped the distance matrix into a binary representation using de2bi in MATLAB. This enables me to use the Hamming distance with kmeans, but now the distance matrix has a different size, resulting in more elements in my clustering. I can't use the taxicab distance; it's not suitable for my interpretation of my original matrix... – Thinus Viljoen Feb 21 '12 at 19:17