Questions tagged [hamming-distance]

The Hamming distance is a mathematical distance function (a metric) for a pair of equal-length strings (sequences). It counts the number of positions at which the corresponding symbols differ. Posts that are not about implementation may belong on https://math.stackexchange.com.
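As a minimal sketch of the general (non-binary) case, the function below compares two C strings position by position. The name `hamming_str` and the choice to return -1 for strings of unequal length are assumptions made for this illustration, not part of any standard library.

```c
#include <assert.h>
#include <stddef.h>
#include <string.h>

/* Hypothetical helper: Hamming distance between two equal-length C strings.
   Returns -1 when the lengths differ, because the distance is undefined then. */
long hamming_str(const char *a, const char *b)
{
    assert(a != NULL && b != NULL);

    size_t n = strlen(a);
    if (strlen(b) != n)
        return -1;

    long dist = 0;
    for (size_t i = 0; i < n; i++)
        dist += (a[i] != b[i]);   /* add 1 for each position where the symbols differ */
    return dist;
}
```

For example, `hamming_str("karolin", "kathrin")` compares seven positions and finds three that differ.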

For the special case of two binary strings, it may be implemented as the bitcount of their XOR:

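/* popcount is not standard C; __builtin_popcount is the GCC/Clang builtin. */
int H(unsigned a, unsigned b){
    return __builtin_popcount(a ^ b);
}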
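Where no bit-counting routine or compiler builtin is available, the bit count can be computed by hand. The sketch below assumes 32-bit unsigned inputs and uses the well-known trick of repeatedly clearing the lowest set bit; the names `popcount32` and `hamming_u32` are hypothetical.

```c
#include <assert.h>
#include <stdint.h>

/* Hypothetical portable bit count: each iteration clears the lowest set bit,
   so the loop runs once per set bit in x. */
static int popcount32(uint32_t x)
{
    int count = 0;
    while (x != 0) {
        x &= x - 1;   /* clear the lowest set bit */
        count++;
    }
    assert(count <= 32);   /* sanity check: a 32-bit word has at most 32 set bits */
    return count;
}

/* Hamming distance of two 32-bit words: the number of bits set in their XOR. */
int hamming_u32(uint32_t a, uint32_t b)
{
    return popcount32(a ^ b);
}
```

With this sketch, `hamming_u32(7, 0)` counts the three set bits of binary 111.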
291 questions
2
votes
1 answer

What's the relationship between Hamming distance and Simple Matching Coefficient?

I'm doing the exercises in Introduction to Data Mining and got stuck on the following question: Which approach, Jaccard or Hamming distance, is more similar to the Simple Matching Coefficient, and which approach is more similar to the cosine measure?…
Nia
  • 99
  • 1
  • 2
  • 6
2
votes
3 answers

Find Strings with certain Hamming distance LINQ

If we run the following (thanks to @octavioccl for help) LINQ Query: var result = stringsList .GroupBy(s => s) .Where(g => g.Count() > 1) .OrderByDescending(g => g.Count()) .Select(g => g.Key); It gives us all the strings which occur…
Failed Scientist
  • 1,977
  • 3
  • 29
  • 48
2
votes
1 answer

Best way to compare data elements using Fuzzy Match Algorithms

I am looking to compare two data elements or fields via Fuzzy Match Algorithm for Record Linkage in C#, and I want to determine which algorithm would be best for each comparison. The fields I am looking to compare are: Last Name First…
user6159282
2
votes
3 answers

Simple and Quick way to calculate the hamming distance of a binary integer to 0?

I'm writing a Sudoku solver and I have to calculate what I learned is called the Hamming distance of an int to 0, e.g. the Hamming distance of 7 (111 in binary) to 0 is 3. So I simply do: for(int dist = 0 ; num != 0 ; num>>=1) dist +=…
Maljam
  • 6,244
  • 3
  • 17
  • 30
2
votes
2 answers

how to calculate all nearest neighbours from a byte in a hamming ball

I want to calculate all possible Hamming neighbours of a given byte within a maximum Hamming distance. For a Hamming distance of 1 I have created this function: public static ArrayList hammingNeighbours(byte input, int maxDistance){ …
501 - not implemented
  • 2,638
  • 4
  • 39
  • 74
2
votes
0 answers

Calculating hamming distance with mongodb?

Say I've got a large set of documents that contain perceptual hashes (around 35,000). What is the fastest way (using MongoDB) to compare a given hash X to all the hashes in my database and find the ones with a distance less than N? I'm using…
davegri
  • 2,206
  • 2
  • 26
  • 45
2
votes
3 answers

input 2 integers and get binary, brgc, and hamming distance

I've got everything except hamming distance. I keep getting the error "int() can't convert non-string with explicit base" here is my code: def int2bin(n): if n: bits = [] while n: …
Joe
  • 21
  • 1
  • 4
2
votes
0 answers

Select all rows that have a hamming distance less than n?

I have a database in which each row has a column "hash" holding a varchar string. Is there a way in SQL to select all the rows that have a Hamming distance of less than n from a string I provide? For example, if I provide the string…
davegri
  • 2,206
  • 2
  • 26
  • 45
2
votes
2 answers

Calculate graph distance based on different edge types

Let's say I have the following unweighted, undirected graph where edges can be connected by two different types of edges: support edges (green) and opposition edges (red). Here's an example: I want to calculate the "distance" of opposition or…
2
votes
1 answer

Calculate hamming distance in perl

I have the following list of words (words.txt) in a file shown in IPA characters (international phonetic alphabet). Below, I have assigned each IPA character with a binary code in a separate file (sounds.txt). I want to compare each word in the…
Mck18
  • 41
  • 3
2
votes
1 answer

insert into vantage-point tree

Given a large collection of 64 bit integers, my goal is to find the integer with the smallest Hamming distance from a new integer, after which the new integer will be inserted in the collection. For this practice, I plan on using a vantage-point…
Aart Stuurman
  • 3,188
  • 4
  • 26
  • 44
2
votes
5 answers

R - Compute Mismatch By Group

I was wondering how I could compute mismatching cases by group. Let us imagine that this is my data: sek = rbind(c(1, 'a', 'a', 'a'), c(1, 'a', 'a', 'a'), c(2, 'b', 'b', 'b'), c(2, 'c', 'b', 'b')) colnames(sek) <-…
giac
  • 4,261
  • 5
  • 30
  • 59
2
votes
0 answers

Google + Hamming (or binary) Search

I've read here: How do search engines merge results from an inverted index? "query execution algorithms are actually fairly dumb" and there are many sophisticated algorithms using Hamming distance that are supposed to be used in a search…
2
votes
7 answers

How to calculate the hamming distance of two binary sequences in PHP?

hamming('10101010','01010101') The result of the above should be 8. How to implement it?
user198729
  • 61,774
  • 108
  • 250
  • 348
2
votes
1 answer

R: clustering documents

I've got a documentTermMatrix that looks as follows: artikel naam product personeel loon verlof doc 1 1 1 2 1 0 0 doc 2 1 1 1 0 0 0 doc 3 0 0 1 1 …
Anita
  • 759
  • 1
  • 10
  • 23