
I have many 8-dimensional vectors, and I want to build an R-tree index on them ('USING gist' in PostgreSQL). But I am stuck on choosing the correct geometric type.

Would the 'path' type (with the vector components used as the x and y coordinates of the path's points) be suitable for my purposes? Or should I use an additional extension? Or am I heading in the wrong direction?

Thanks in advance!

Vladislav
  • If your data fits into main memory, use a dedicated analysis tool like ELKI instead of a SQL database. In my experience, the performance of gist is not very good; it only pays off if you need persistence and transactions. – Has QUIT--Anony-Mousse Mar 11 '16 at 16:52
  • @Anony-Mousse Thanks for the advice! I have used this program. But I wanted to estimate the parameters for DBSCAN using the k-dist plot from the original article, and I didn't expect to run into such difficulties. And does ELKI also use R-trees, unlike scikit? – Vladislav Mar 11 '16 at 19:31
  • ELKI has R-trees and many other indexes (I mostly use cover trees), and a class (something with knn) that you can use to estimate the epsilon parameter. Or you can use OPTICS. – Has QUIT--Anony-Mousse Mar 11 '16 at 19:37
  • @Anony-Mousse Yes! KNNDistancesSampler is exactly what I need! Thanks a lot! – Vladislav Mar 11 '16 at 19:42
  • Depending on how many 'many vectors' is, you may also want to look at the ELKI-phtree plug-in; it should work nicely with a million vectors or more. – TilmannZ Mar 12 '16 at 19:43
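
For readers who want to see what the k-dist estimate mentioned in the comments computes, here is a minimal sketch in plain Python. It is not ELKI's KNNDistancesSampler and it does not use an R-tree or any other index: it is a brute-force O(n^2) loop, so it is only reasonable for small samples. The function name `k_distances` and the sample data are made up for illustration. The idea is to take each point's distance to its k-th nearest neighbor, sort those distances in descending order, and look for the "knee" of the resulting curve as a candidate for DBSCAN's epsilon.

```python
import math


def k_distances(points, k):
    """Return every point's distance to its k-th nearest neighbor,
    sorted in descending order (the "sorted k-dist" curve).

    Hypothetical helper for illustration: brute force, no spatial index.
    """
    if not 1 <= k < len(points):
        raise ValueError("k must satisfy 1 <= k < number of points")
    result = []
    for i, p in enumerate(points):
        # Euclidean distances from p to every other point, ascending.
        dists = sorted(math.dist(p, q) for j, q in enumerate(points) if j != i)
        # The k-th nearest neighbor sits at index k - 1.
        result.append(dists[k - 1])
    return sorted(result, reverse=True)


# Tiny made-up sample of 8-dimensional vectors.
sample = [
    (0, 0, 0, 0, 0, 0, 0, 0),
    (1, 0, 0, 0, 0, 0, 0, 0),
    (3, 0, 0, 0, 0, 0, 0, 0),
    (10, 0, 0, 0, 0, 0, 0, 0),
]
curve = k_distances(sample, k=2)
```

Plotting `curve` against its index gives the k-dist graph; on larger data, the same values would normally come from an indexed k-nearest-neighbor query rather than from this double loop.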
