Can anyone provide some insight into the chance of collisions when using FARM_FINGERPRINT in BigQuery to generate INT64 hashes for use as surrogate keys on tables?
Going with a normal UUID makes the key columns roughly 4x larger. I was thinking FARM_FINGERPRINT(GENERATE_UUID()) might provide an INT64 alternative. I know collisions are always a concern, but reading the SMHasher output for FarmHash, it looks like it could be an option, as it does not currently show any collision issues.
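To put a rough number on the collision concern, here is a back-of-the-envelope sketch in Python using the standard birthday-bound approximation. It assumes FARM_FINGERPRINT behaves like an ideal, uniformly distributed 64-bit hash over the UUID inputs, which the SMHasher results suggest but do not guarantee. The function name `collision_probability` is just an illustrative helper, not anything from BigQuery.

```python
import math


def collision_probability(n_keys: int, hash_bits: int = 64) -> float:
    """Approximate P(at least one collision) among n_keys hashes.

    Uses the birthday approximation p ~ 1 - exp(-n(n-1) / (2 * 2^bits)).
    ASSUMPTION: hash outputs are uniform and independent.
    """
    space = 2.0 ** hash_bits
    # expm1 keeps precision when the exponent is very close to zero.
    return -math.expm1(-n_keys * (n_keys - 1) / (2.0 * space))


if __name__ == "__main__":
    # Table sizes here are arbitrary examples, not measurements.
    for n in (10**6, 10**8, 10**9, 10**10):
        print(f"{n:>14,d} keys -> p ~ {collision_probability(n):.3e}")
```

Under that uniformity assumption, the estimate stays negligible for millions of keys, reaches a few percent at around a billion keys, and a collision becomes more likely than not beyond roughly five billion. So whether this is acceptable depends heavily on expected table size.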
Apart from size, I have users concerned about join performance on STRING vs INT64 surrogate keys in BigQuery. I cannot find anything official that addresses this to calm their fears, which is why I am considering this method of generating an INT64 hash.