I have vectors of size n = 1000..3000, each with mean = 0, std-dev = 1/sqrt(n), and length (norm) = 1.

My question is: what formula or algorithm can I use to reduce the vectors' size while preserving dot-product similarity? That is, after the reduction, the dot product between any two vectors should be approximately the same as for the original vectors.


Random selection of components seems to work:

Here `%` normalizes both operands and then takes their dot product.

In [34]: ab % bc
Out[34]: 0.445

In [35]: ab[::2] % bc[::2]
Out[35]: 0.424

In [36]: ab[::3] % bc[::3]
Out[36]: 0.440

In [39]: rnd = np.random.randint(0,1000,300)

In [40]: ab[rnd] % bc[rnd]
Out[40]: 0.450
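
For anyone who wants to reproduce this, here is a minimal sketch of the experiment. The `cosine` helper is my guess at what `%` does (normalize, then dot product), and `ab` / `bc` below are synthetic stand-ins, not my real data. One difference from the session above: `np.random.randint` draws indices with replacement, so some components may be picked twice, while the sketch uses `Generator.choice(..., replace=False)` so that every index is distinct.

```python
import numpy as np

def cosine(a, b):
    """Normalize both operands, then take their dot product (stand-in for `%`)."""
    return float(np.dot(a, b) / (np.linalg.norm(a) * np.linalg.norm(b)))

rng = np.random.default_rng(0)
n, k = 1000, 300  # original size and reduced size (illustrative values)

# Hypothetical test vectors: a shared component plus independent noise,
# each drawn with mean 0 and std-dev 1/sqrt(n), so the two are correlated.
shared = rng.normal(0.0, 1.0 / np.sqrt(n), n)
ab = shared + rng.normal(0.0, 1.0 / np.sqrt(n), n)
bc = shared + rng.normal(0.0, 1.0 / np.sqrt(n), n)

# Rescale to unit norm, as in the question.
ab /= np.linalg.norm(ab)
bc /= np.linalg.norm(bc)

# Pick k distinct component indices and use the same indices for both vectors.
idx = rng.choice(n, size=k, replace=False)

full = cosine(ab, bc)            # similarity at full size
sub = cosine(ab[idx], bc[idx])   # similarity after random selection

print(f"full: {full:.3f}  subsampled: {sub:.3f}")
```

The same `idx` has to be applied to every vector; otherwise the reduced vectors no longer share a coordinate system and their dot products are not comparable.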
sten