I am trying to find similar images within a MySQL 8.0 table compared to a given phash.
The phashes are generated via python and currently stored within a varchar(255) field. Typically they look like this: ae95916ec1354a9d
My query is supposed to return the differnce in bytes:
SELECT m.*,
BIT_COUNT( 0xae95916ec1354a9d ^ m.HASH) as hd,
BIT_COUNT( 0xae95916ec1354a9d ^ concat('0x', m.HASH)) as hd,
concat('0x', m.HASH)
from
media m
where m.HASH is not null
ORDER BY hd ASC;
This unfortunatelly does not work and I suspect it is because there is a 0x missing in front of the hash.
The hashes are generated like this in Python:
response2 = requests.get(url, stream=True)
if response2:
response2.raw.decode_content = True
image2 = Image.open(response2.raw)
hash2 = str(imagehash.phash(image2))
Am I saving the hash inside the wrong data type field or what is wrong with this approach?