I want to reduce a 100-dimensional vector v to a 10-dimensional vector v'.

The following property must be preserved:

For arbitrary 100-dimensional vectors w1 and w2,
if v * w1 > v * w2 (where * denotes the inner product),
then after the reduction:
v' * w1' > v' * w2'

I learned that random projection is one method (http://scikit-learn.org/stable/modules/random_projection.html), but it approximately preserves the values of distances and inner products. I only want to keep the relative order (> or <), not the absolute distance or inner-product values.

The other problem with random projection is that it is suited to reducing very high-dimensional data (for example, from 10000 to 3000 dimensions).

from sklearn.random_projection import johnson_lindenstrauss_min_dim
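johnson_lindenstrauss_min_dim gives a bound on the minimum number of target dimensions needed for a given number of samples and a given distortion.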
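To make that bound concrete, here is a minimal sketch that evaluates the formula given in the scikit-learn documentation for johnson_lindenstrauss_min_dim, n_components >= 4 ln(n_samples) / (eps^2 / 2 - eps^3 / 3), using only the standard library. The helper name jl_min_dim is hypothetical and the sample count of 5 is taken from the pseudocode in this question; treat the integer truncation as an assumption about how the library rounds.

```python
import math

def jl_min_dim(n_samples, eps):
    """Johnson-Lindenstrauss bound on the target dimension, truncated to an int.

    jl_min_dim is a hypothetical helper written for illustration; it is not
    the scikit-learn function itself.
    """
    denominator = eps ** 2 / 2 - eps ** 3 / 3
    return int(4 * math.log(n_samples) / denominator)

# Five samples, as in the pseudocode, at a tight and a loose distortion.
for eps in (0.1, 0.5):
    print(eps, jl_min_dim(5, eps))
```

Even for five points and a distortion as loose as 0.5, the bound comes out well above 10, which suggests the guarantee does not cover a 100-to-10 reduction.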

Below is Python pseudocode to explain what I need:

import sys
import numpy as np

def compare(a, b_lst):
    # Pair each row index of b_lst with its inner product against a,
    # then sort the pairs by inner product (ascending).
    a = np.ravel(a)
    d_lst = [(indx, np.dot(a, b)) for indx, b in enumerate(b_lst)]
    return sorted(d_lst, key=lambda v: v[1])

x = np.random.rand(1, 100)
y = np.random.rand(5, 100)
result1 = compare(x, y)

# do projection
# (projection_method is a placeholder for the transform I am looking for)
transformer = projection_method(object_dimension=10)
x1 = transformer.transform(x)
y1 = transformer.transform(y)
result2 = compare(x1, y1)

for i in range(len(result1)):
    if result1[i][0] != result2[i][0]:  # compare sorted index
        print('failed')
        sys.exit(-1)
print('passed')
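Since projection_method is only a placeholder, the check above cannot be run as written. A rough, runnable variant is sketched below under one loud assumption: a dense Gaussian random matrix (scaled by 1/sqrt(10)) stands in for the unknown transform. The names ranking, proj, and kept are made up for this sketch, and it counts how often the ordering survives rather than exiting on the first mismatch.

```python
import numpy as np

def ranking(a, b_rows):
    """Indices of the rows of b_rows, sorted by inner product with a (ascending)."""
    return np.argsort(b_rows @ a)

rng = np.random.default_rng(0)
n_trials, kept = 1000, 0
for _ in range(n_trials):
    x = rng.random(100)
    y = rng.random((5, 100))
    # Assumed stand-in for projection_method: Gaussian matrix, 100 -> 10 dims.
    proj = rng.normal(size=(100, 10)) / np.sqrt(10)
    # Count the trial only if all five candidates keep their order.
    if np.array_equal(ranking(x, y), ranking(x @ proj, y @ proj)):
        kept += 1

print(f"ranking preserved in {kept} of {n_trials} trials")
```

Counting over many trials gives a rate instead of a single pass/fail, which is a more informative way to compare candidate transforms.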
Ray
xunzhang

1 Answer

There are no such ready-made transforms. Even if one exists that I am not aware of, no transformation is going to preserve such a property exactly. By reducing the dimension you are intrinsically losing information.
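For the special case of a linear map, the loss of information can be shown directly. The sketch below is an illustration under assumptions, not a general proof: P is an arbitrary random 10 x 100 matrix, and the variable names are chosen for this example. Any such P discards a 90-dimensional subspace, so two vectors that differ only inside that subspace become indistinguishable after projection even though v ranks them strictly.

```python
import numpy as np

rng = np.random.default_rng(0)
P = rng.normal(size=(10, 100))  # an arbitrary linear map from 100 to 10 dims
v = rng.random(100)
w2 = rng.random(100)

# With full SVD, rows 10..99 of vt form an orthonormal basis of the
# null space of P (P has rank 10 almost surely).
_, _, vt = np.linalg.svd(P)
null_basis = vt[10:]
n = null_basis.T @ (null_basis @ v)  # the part of v that P throws away
w1 = w2 + n                          # differs from w2 only in the null space

print(v @ w1 > v @ w2)               # strict inequality before projection
print(np.allclose(P @ w1, P @ w2))   # same image after projection
```

Because w1 and w2 map to the same 10-dimensional point, no comparison made after the projection can recover which of them had the larger inner product with v.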

Raff.Edward
  • 6,404
  • 24
  • 34
  • I think this property is weaker than preserving absolute distance/inner-product values. – xunzhang Dec 09 '13 at 03:43
  • It would be, but the question is for the extreme case of only 100 dimensions (which is really small for this matter) down to only 10. The question is also phrased such that the results must be exact - which is not possible in general. – Raff.Edward Dec 09 '13 at 03:47
  • Actually, I'm not sure that it would be weaker. The inner product between two points could be harder to maintain than the Euclidean distance, since it can be positive or negative. – Raff.Edward Dec 09 '13 at 03:52