I have vectors of size n = 1000 to 3000. Each has mean 0, standard deviation 1/sqrt(n), and unit norm.
My question: what formula or algorithm can I use to reduce the vectors' size while preserving dot-product similarity? That is, after reducing the size, the dot product between any two vectors should be approximately the same as for the original vectors.
Random selection seems to work.
Here % normalizes both operands and then takes their dot product:
In [34]: ab % bc
Out[34]: 0.445
In [35]: ab[::2] % bc[::2]
Out[35]: 0.424
In [36]: ab[::3] % bc[::3]
Out[36]: 0.440
In [39]: rnd = np.random.randint(0,1000,300)
In [40]: ab[rnd] % bc[rnd]
Out[40]: 0.450
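
For anyone who wants to reproduce the experiment, here is a minimal sketch. Everything in it is an assumption rather than my actual setup: the helper `cosine` stands in for the `%` operator, the vectors `ab` and `bc` are synthetic data built to have a similarity near 0.45, and the indices are drawn without replacement (the `np.random.randint` call above can pick the same index more than once).

```python
import numpy as np

def cosine(a, b):
    """Normalize both operands, then take the dot product (stand-in for `%`)."""
    a = np.asarray(a, dtype=float)
    b = np.asarray(b, dtype=float)
    return float(np.dot(a, b) / (np.linalg.norm(a) * np.linalg.norm(b)))

rng = np.random.default_rng(0)   # fixed seed so the run is repeatable
n, k = 1000, 300                 # original size and reduced size

# Synthetic test data (an assumption): two correlated vectors with
# roughly zero-mean entries, scaled to unit norm.
ab = rng.standard_normal(n)
bc = 0.45 * ab + np.sqrt(1 - 0.45**2) * rng.standard_normal(n)
ab /= np.linalg.norm(ab)
bc /= np.linalg.norm(bc)

# Similarity on the full vectors
full = cosine(ab, bc)

# Similarity on every third coordinate
strided = cosine(ab[::3], bc[::3])

# Similarity on k coordinates chosen at random, without replacement;
# the same index set must be applied to both vectors.
idx = rng.choice(n, size=k, replace=False)
sampled = cosine(ab[idx], bc[idx])

print(f"full={full:.3f} strided={strided:.3f} sampled={sampled:.3f}")
```

The reduced similarities are only estimates of the full one, so they fluctuate from one index set to another, and the fluctuation should grow as k shrinks.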