I have a question about how to cluster sets using MinHash together with the banding technique.
I assume everyone reading has a good knowledge of MinHash so I won't define most of the terms I'm using.
My goal is to use MinHash to cluster users according to the similarity of their signatures. In a local, non-banded setting this would be trivial: if two users' signatures hash to the same value, they go in the same cluster.
If we split the signatures into bands and process the bands independently, I can treat each band as described above and generate a set of clusters for every band. My question is: how should I aggregate these per-band clusters? Should I just merge two clusters if they have at least one element in common? Or should I do something different?
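To make the "merge if they share an element" option concrete, here is a minimal Python sketch of what I have in mind. It is only an illustration under my own assumptions: the names `band_buckets` and `merge_clusters`, the toy signatures, and the choice of a union-find structure are all hypothetical, not taken from any library.

```python
from collections import defaultdict

def band_buckets(signatures, bands, rows):
    """Map (band index, band contents) -> list of users sharing that band."""
    buckets = defaultdict(list)
    for user, sig in signatures.items():
        for b in range(bands):
            key = (b, tuple(sig[b * rows:(b + 1) * rows]))
            buckets[key].append(user)
    return buckets

def merge_clusters(signatures, bands, rows):
    """Merge per-band clusters that share at least one user (union-find)."""
    parent = {u: u for u in signatures}

    def find(u):
        # Follow parent links to the root, shortening the path on the way.
        while parent[u] != u:
            parent[u] = parent[parent[u]]
            u = parent[u]
        return u

    def union(a, b):
        ra, rb = find(a), find(b)
        if ra != rb:
            parent[rb] = ra

    # Every bucket is one per-band cluster: link all of its members together.
    for users in band_buckets(signatures, bands, rows).values():
        for other in users[1:]:
            union(users[0], other)

    clusters = defaultdict(set)
    for u in signatures:
        clusters[find(u)].add(u)
    return list(clusters.values())

# Toy signatures of length 4, split into 2 bands of 2 rows each (made-up values).
signatures = {
    "alice": [1, 2, 3, 4],
    "bob":   [1, 2, 9, 9],
    "carol": [7, 7, 9, 9],
    "dave":  [5, 5, 6, 6],
}
clusters = merge_clusters(signatures, bands=2, rows=2)
```

In this toy example alice and bob collide in the first band and bob and carol collide in the second, so all three end up in one cluster even though alice and carol share no band at all. That transitive chaining is exactly the part I am unsure about.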
Thanks