We can use a bitmask to efficiently represent set membership over a finite (or at least indexed) domain. For instance, the letters in "car" could be represented as a 26-bit set like so:
abcdefghijklmnopqrstuvwxyz
10100000000000000100000000
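For concreteness, here is a minimal Python sketch of how I would build that mask. The function name letter_mask is just something I made up for illustration, and I am assuming lowercase a-z only, with "a" mapped to the leftmost of the 26 bits to match the layout above.

```python
def letter_mask(word: str) -> int:
    """Return a 26-bit mask with one bit set per distinct letter a-z in word."""
    mask = 0
    for ch in word.lower():
        if "a" <= ch <= "z":
            # 'a' is the most significant of the 26 bits, 'z' the least,
            # so the binary string reads left to right as a..z.
            mask |= 1 << (25 - (ord(ch) - ord("a")))
    return mask


# Print the mask padded to 26 binary digits so it lines up with the alphabet.
print(format(letter_mask("car"), "026b"))
```

Setting a bit that is already set changes nothing, so a repeated letter leaves the mask unchanged.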
Of course, this can only represent presence, not duplicates: "carry", for instance, actually has two "r"s, but a set cannot represent that.
A multiset stores a count for each element rather than just its existence, so it can represent duplicates. However, it's not clear to me whether a multiset can be represented sensibly as a single number.
One idea, suggested by a coworker, would be to use primes as our indices and represent a multiset by its prime factorization. So our cases above would become:
car = 2^1 * 3^0 * 5^1 * ... * 61^1 * ...
carry = 2^1 * 3^0 * 5^1 * ... * 61^2 * ... * 97^1 * 101^0
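To make the idea concrete, this is roughly how I understand the scheme, as a rough Python sketch rather than a finished implementation. The names PRIMES, encode and decode are my own, and I am assuming the domain is just the 26 lowercase letters, with the n-th letter mapped to the n-th prime.

```python
from collections import Counter

# The first 26 primes, one per letter: a -> 2, b -> 3, ..., r -> 61, ..., z -> 101.
PRIMES = [2, 3, 5, 7, 11, 13, 17, 19, 23, 29, 31, 37, 41,
          43, 47, 53, 59, 61, 67, 71, 73, 79, 83, 89, 97, 101]


def encode(word: str) -> int:
    """Multiply together the prime for each letter; repeats raise the exponent."""
    n = 1
    for ch in word.lower():
        if "a" <= ch <= "z":
            n *= PRIMES[ord(ch) - ord("a")]
    return n


def decode(n: int) -> Counter:
    """Recover the per-letter counts by dividing out each prime in turn."""
    if n < 1:
        raise ValueError("encoded multisets are positive integers")
    counts = Counter()
    for i, p in enumerate(PRIMES):
        while n % p == 0:
            n //= p
            counts[chr(ord("a") + i)] += 1
    return counts
```

With this sketch, decode(encode("carry")) gives back the same counts as Counter("carry"), including the two "r"s. Python integers are arbitrary precision, so the product is not limited to a machine word here.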
Is this a sound way to represent multisets? Are there better binary representations of such a concept?