I'm trying to use bloom/cuckoo filter to check if i already stored a tuple of 2 numbers f.e. (7,25), (47,1576), ... etc
In [12]: from cuckoo.filter import ScalableCuckooFilter
In [13]: c = ScalableCuckooFilter(initial_capacity=1000000, error_rate=0.0001)
In [14]: c.filters[0].buckets.buffer_info()
Out[14]: (140042688086032, 8500000, 'big', 0, 8500000, 0, 0, 0)
In [15]: 8500000/(1024*1024)
Out[15]: 8.106 MB
to store the numbers i would need 2bytes per number, so 1mln * 4bytes = 3.8 Mb
So the filter is bigger than the data !
Is there a way or structure that is suitable for numbers with lower memory footprint ?