I am counting various patterns in graphs, and I store the relevant information in a defaultdict of lists of numpy arrays of size N, where the index values are integers.
I want to efficiently know if I am creating a duplicate array. Not removing duplicates can exponentially grow the amount of duplicates to the point where what I am doing becomes infeasible. But there are potentially hundreds of thousands of arrays, stored in different lists, under different keys. As far as I know, I can't hash an array.
If I simply needed to check for duplicate nonzero indices, I would store the nonzero indices as a bit sequence of ones and then hash that value. But, I don't only need to check the indices - I need to also check their integer values. Is there any way to do this short of coming up with a completely knew design that uses different structures?
Thanks.