0

I've a db table with a varchar(64) field to store PHashing data, as 64 chars (1's and 0's ascii characters). I need to calculate hamming distance with a test hasta, and it seems that the most efficient way to do is using mysql bit_count function. The problem is that I haven't found any way to convert/cast/whatever the hash field in order to be interpreted as a byte(8) instead of varchar(64). Like:

> select hash from data;
"0000000101100111111100011110000011100000111100011011111110011011"
> select convert_to_binary(hash) from data;
0b0000000101100111111100011110000011100000111100011011111110011011

I cannot alter the data and convert all previous data into a binary field. Is there any way to force mysql to re-interpret a field as a literal, or any other alternative solution?

sucotronic
  • 1,504
  • 9
  • 20

2 Answers2

0

I think you should be able to use it like this:

SELECT BIT_COUNT(CAST(hash AS BINARY)) FROM data;

Use CAST to convert the field to BINARY and pass the result to BIT_COUNT. Casting the field hash to binary will turn it into a string with the binary data type, but BIT_COUNT is able to handle that.

Louis Huppenbauer
  • 3,719
  • 1
  • 18
  • 24
  • Casting produces a 64byte data, so I cannot compare it against 64bit literal. I can cast also the literal from string to binary, but it produces wrong results. For example, **select bit_count(cast("011" as binary)^cast("111" as binary))** returns 3 and **select bit_count(cast("101" as binary)^cast("111" as binary))** returns 2. If you do this with binary data, the reply is ok, and always 1: **select bit_count(0b101^0b111)**, **select bit_count(0b011^0b111)** – sucotronic Dec 05 '14 at 11:35
0

Working code:

SELECT BIT_COUNT( CONV( hash, 2, 10 ) ^ 
0b0000000101100111111100011110000011100000111100011011111110011011 )
sucotronic
  • 1,504
  • 9
  • 20