Questions tagged [lsh]

Locality-sensitive hashing

Locality-sensitive hashing (LSH) reduces the dimensionality of high-dimensional data. LSH hashes input items so that similar items map to the same “buckets” with high probability (the number of buckets being much smaller than the universe of possible input items). LSH differs from conventional and cryptographic hash functions because it aims to maximize, rather than minimize, the probability of a “collision” for similar items. Locality-sensitive hashing has much in common with data clustering and nearest neighbor search.
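As a rough illustration of the bucket idea described above, here is a minimal MinHash-with-banding sketch in plain Python. It is only one of several LSH families and is not taken from any question under this tag; the function names (`shingles`, `minhash_signature`, `lsh_buckets`, `estimated_jaccard`) and the parameter values (3-character shingles, 32 hashes, 8 bands) are assumptions chosen for the example.

```python
import hashlib


def shingles(text, k=3):
    """Return the set of character k-grams of text (empty if len(text) < k)."""
    return {text[i:i + k] for i in range(len(text) - k + 1)}


def _seeded_hash(seed, token):
    """Deterministic 64-bit hash of a token; the seed selects the hash function."""
    data = f"{seed}:{token}".encode("utf-8")
    return int.from_bytes(hashlib.blake2b(data, digest_size=8).digest(), "big")


def minhash_signature(items, num_hashes=32):
    """MinHash signature: for each seeded hash, keep the minimum over the set.

    Assumes items is non-empty; min() raises ValueError on an empty set.
    """
    return [min(_seeded_hash(seed, x) for x in items) for seed in range(num_hashes)]


def lsh_buckets(signature, bands=8):
    """Split the signature into bands; each (band index, band values) pair is a bucket key.

    Two items become candidate neighbors if they share at least one bucket key.
    """
    rows = len(signature) // bands
    return {(b, tuple(signature[b * rows:(b + 1) * rows])) for b in range(bands)}


def estimated_jaccard(sig_a, sig_b):
    """Fraction of matching signature positions, an estimate of Jaccard similarity."""
    return sum(a == b for a, b in zip(sig_a, sig_b)) / len(sig_a)


if __name__ == "__main__":
    a = minhash_signature(shingles("locality sensitive hashing"))
    b = minhash_signature(shingles("locality-sensitive hashing"))
    print("estimated Jaccard:", estimated_jaccard(a, b))
    print("shared buckets:", len(lsh_buckets(a) & lsh_buckets(b)))
```

With more bands (fewer rows per band), moderately similar items are more likely to share a bucket, at the cost of more false candidates; with fewer bands the filter becomes stricter.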

48 questions
-1
votes
1 answer

Reverse TF-IDF vector (vec2text)

Given a doc2vec vector generated from some document, is it possible to reverse the vector back to the original document? If so, does there exist any hash algorithm that would make the vector irreversible but still comparable to other vectors of the…
-1
votes
1 answer

Questions about LSH (Locality-sensitive hashing) and minhashing implementation

I'm trying to implement the paper “Browser Fingerprint Coding Methods Increasing the Effectiveness of User Identification in the Web Traffic”. I have a couple of questions about the LSH algorithm in general and the proposed implementation: The LSH…
ianux22
-1
votes
1 answer

Increasing the number of hash tables in MinHashLSH decreases accuracy and F1

I have used MinHashLSH with approximateSimilarityJoin in Scala and Spark 2.4 to find edges in a network (link prediction based on document similarity). My problem is that as I increase the number of hash tables in MinHashLSH, my accuracy and…
atheodos