Questions tagged [locality-sensitive-hash]

Locality-sensitive hashing (LSH) is a method of probabilistic dimension reduction.

Locality-sensitive hashing (LSH) is a method of performing probabilistic dimension reduction of high-dimensional data. The basic idea is to hash the input items so that similar items are mapped to the same buckets with high probability (the number of buckets being much smaller than the universe of possible input items).

97 questions
0
votes
1 answer

merging social media profiles (Locality Sensitive Hashing)

Was wondering if anyone had any nice reading material regarding the topic which is pretty self explanatory what I want to do is create a program which is able to merge different social media profiles into one profile. So for example if I have a…
0
votes
1 answer

Finding and suggesting most similar queries from a query log

Given a query log of about 10 million queries I have to write a program that will ask query from the user and display most similar 10 queries to the input query as a output. Also in case of spelling mistakes it may suggest the correct spellings.…
Joy
  • 4,197
  • 14
  • 61
  • 131
0
votes
2 answers

Quick/easy array comparison algorithm without sharing data

I have two arrays generated by two different systems that are independent of each other. I want to compare their similarities by comparing only a few numbers generated from the arrays. Right now, I'm only comparing min, max, and sums of the arrays,…
CookieOfFortune
  • 13,836
  • 8
  • 42
  • 58
-1
votes
1 answer

Generate same hashcode for vectors that have jaccard similarity above a certain threshold

Modifying the hashCode() method in java such that vectors can generate same hashcode for vectors that have jaccard similarity above a certain threshold with good accuracy example: vector 1: [1,1,0,0,1,0] vector 2: [1,1,0,0,0,0] they have jaccard…
dydy
  • 1
  • 1
-1
votes
1 answer

Nearest neighbor search vs Near duplicate detection

I was searching for a few AI/ML and non-AI/ML solutions for the "Near duplicate detection" problem (text, image, audio), I found that there is a similar/exact problem i,e, "Nearest neighbor search" which is also seems handled exactly the same way as…
-1
votes
1 answer

Questions about LSH (Locality-sensitive hashing) and minihashing implementation

I'm trying to implement this paper Browser Fingerprint Coding Methods Increasing the Effectiveness of User Identification in the Web Traffic I got a couple of questions about the LHS algorithm in general and the proposed implementation: The LSH…
ianux22
  • 405
  • 4
  • 16
-1
votes
1 answer

trying to understand LSH through the sample python code

the concise python code i study for is here Question A @ line 8 i do not really understand the syntax meaning for "res = res << 1" for the purpose of "get_signature" Question B @ line 49 (SOLVED BY myself through another Q&A) "xor = r1^r2" does not…
user381509
  • 65
  • 7
1 2 3 4 5 6
7