I'm trying to generate aggregate HLL sketches in a Scala Spark job and write them to a varbinary column in Trino for dashboard aggregations.
I'm using the spark-alchemy library to generate the sketches in Spark, but I keep running into compatibility issues when running the cardinality
function in Trino. Specifically, I get the error:
Cannot deserialize HyperLogLog
Trino uses the HLL implementation from airlift, but bringing that library in and writing UDFs around it seems awfully cumbersome. Is there a more streamlined way to achieve interoperability between Spark HLL and Trino HLL?