1

I am trying to find similar images within a MySQL 8.0 table compared to a given phash.

The phashes are generated via python and currently stored within a varchar(255) field. Typically they look like this: ae95916ec1354a9d

My query is supposed to return the differnce in bytes:

SELECT m.*,
       BIT_COUNT( 0xae95916ec1354a9d ^ m.HASH) as hd,
       BIT_COUNT( 0xae95916ec1354a9d ^ concat('0x', m.HASH)) as hd,
       concat('0x', m.HASH)
from
     media m
where m.HASH is not null
ORDER BY hd ASC;

This unfortunatelly does not work and I suspect it is because there is a 0x missing in front of the hash.

The hashes are generated like this in Python:

    response2 = requests.get(url, stream=True)
    if response2:
        response2.raw.decode_content = True
        image2 = Image.open(response2.raw)
        hash2 = str(imagehash.phash(image2))

Am I saving the hash inside the wrong data type field or what is wrong with this approach?

merlin
  • 2,717
  • 3
  • 29
  • 59

0 Answers0