Questions tagged [hamming-distance]

The Hamming distance is a distance function (a metric) on pairs of strings (sequences) of equal length. It counts the number of positions at which the corresponding symbols differ. Posts that are not about implementation may belong on https://math.stackexchange.com.

For the special case of two binary strings, it may be implemented as the bitcount of their XOR:

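/* __builtin_popcount is the GCC/Clang builtin; C++20 offers std::popcount. */
int H(unsigned a, unsigned b) {
    return __builtin_popcount(a ^ b);
}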
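For sequences over an arbitrary alphabet, or for binary codes longer than one machine word, the same count can be done position by position. The following is a minimal sketch in C, not a reference implementation. The names `hamming_distance` and `hamming_bits` are hypothetical, and returning -1 for unequal lengths is an assumed convention, since the distance is only defined for sequences of equal length.

```c
#include <stddef.h>
#include <string.h>

/* Hamming distance between two NUL-terminated strings of equal length.
 * Returns -1 if the lengths differ (assumed convention for this sketch). */
long hamming_distance(const char *a, const char *b)
{
    size_t len = strlen(a);
    if (len != strlen(b)) {
        return -1;
    }
    long diff = 0;
    for (size_t i = 0; i < len; ++i) {
        if (a[i] != b[i]) {
            ++diff;  /* symbols at position i differ */
        }
    }
    return diff;
}

/* Hamming distance in bits between two buffers of n bytes each.
 * Uses a portable bit-clearing loop instead of a compiler builtin. */
size_t hamming_bits(const unsigned char *a, const unsigned char *b, size_t n)
{
    size_t diff = 0;
    for (size_t i = 0; i < n; ++i) {
        unsigned char x = (unsigned char)(a[i] ^ b[i]);
        while (x) {
            x &= (unsigned char)(x - 1);  /* clear the lowest set bit */
            ++diff;
        }
    }
    return diff;
}
```

For example, `hamming_distance("karolin", "kathrin")` evaluates to 3, because the strings differ at the third, fourth and fifth positions. The byte-wise loop trades speed for portability; where a hardware popcount is available, XOR-ing whole words and counting bits as in the snippet above is typically faster.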
291 questions
3
votes
1 answer

Most efficient way to find intersections between many sets of numbers

I am trying to efficiently compress sets of numbers which look like this (one set per line): 19 20 23 24 27 29 32 35 69 97 99 119 122 129 132 134 136 137 139 141 147 148 152 157 158 160 170 173 174 175 176 178 179 182 183 185 186 188 189 190 192 194…
3
votes
2 answers

Storing and indexing binary strings in a database

A binary string as defined here is a fixed-size "array" of bits. I call them strings since there is no order on them (sorting/indexing them as numbers has no meaning); each bit is independent of the others. Each such string is N bits long, with N in…
Adi Shavit
3
votes
1 answer

How to implement differentiable hamming loss in pytorch?

How to implement a differentiable loss function that counts the number of wrong predictions?
    output = [1,0,4,10]
    target = [1,2,4,15]
    loss = np.count_nonzero(output != target) / len(output)  # [0,1,0,1] -> 2 / 4 -> 0.5
I have tried a few…
Oleg Dats
3
votes
3 answers

(Speed Challenge) Any faster way to compute distance matrix in terms of generic Hamming distance?

I am looking for a more efficient way to get the distance matrix in terms of Hamming distance. Background: I know there is a function hamming.distance() from package e1071 to compute the distance matrix, but I suspect it might be very slow when…
ThomasIsCoding
3
votes
2 answers

Numpy array is much slower than list

Given two matrices X1 (N,3136) and X2 (M,3136) (where every element in every row is a binary number), I am trying to calculate the Hamming distance so that each row in X1 is compared to all of the rows from X2, such that the result matrix is (N,M). I…
Derik Daro
3
votes
1 answer

Get all pairs of strings (DNA) separated by a distance of Hamming = 1

I have an array of DNA sequences, like: AA TA AC CC, and I am looking for a faster way to calculate the Hamming distance between all sequence pairs (maybe by sorting...) than the naive approach (O(N^2)): For motif1 in array For motif2 in array …
Chadi
3
votes
3 answers

Computing pairwise Hamming distance between all rows of two integer matrices/data frames

I have two data frames, df1 with reference data and df2 with new data. For each row in df2, I need to find the best (and the second best) matching row in df1 in terms of Hamming distance. I used the e1071 package to compute Hamming distance. Hamming…
alaj
3
votes
1 answer

Calculate Hamming weight and/or distance in VBA Excel

I’m trying to compare clients, two by two, whose qualities can be defined by binary choices (for example, whether a client uses a product or not). After much searching online, it looks like I’d need to use the Hamming distance for that, or its equivalent: find…
P. O.
3
votes
1 answer

Algorithm for generating a size k error-correcting code on n bits

I want to generate a code on n bits for k different inputs that I want to classify. The main requirement of this code is the error-correcting criterion: the minimum pairwise distance between any two encodings of different inputs should be maximized. I…
Elliot JJ
3
votes
2 answers

levenshtein matrix cell calculation

I do not understand how the values in the Levenshtein matrix are calculated according to this article. I do know how we arrive at the edit distance of 3. Could someone explain in layman's terms how we arrive at the value in each cell?
jxn
3
votes
1 answer

Calculate the Hamming Distance between the two same datasets

How do I calculate the Hamming distance between two datasets of the same points? Both datasets look exactly the same. http://postimg.org/image/u11qnsolh/ There are two datasets with the same number of points. Total number of points: 19. First data set has…
Irfan
3
votes
1 answer

determine strings that satisfy hamming distance matrix

I am trying to create a list of strings from a hamming distance matrix. Each string must be 20 characters long with a 4 letter alphabet (A,B,C,D). For example, say I have the following hamming distance matrix:
        S1  S2  S3
    S1   0   5  12
    S2   5   0  14
    S3…
ChrisUofR
3
votes
1 answer

Finding hamming distance between ORB feature descriptors

I am trying to write a function to match ORB features. I am not using the default matchers (bfmatcher, flann matcher) because I just want to match specific features in one image with features in another image. I saw that the ORB descriptor is a binary array. My query is…
nayab
3
votes
1 answer

Calculate distance between two descriptors

I'm trying to calculate the distance (Euclidean or Hamming) between two descriptors that have already been calculated. The problem is I don't want to use a matcher, I just want to calculate the distance between two descriptors. I'm using OpenCV 2.4.9 and I have…
zedv
3
votes
1 answer

How should I store and compute Hamming distance between binary codes?

How can I efficiently store binary codes? For certain fixed sizes, such as 32 bits, there are primitive types that can be used. But what if my binary codes are much longer? What is the fastest way to compute the Hamming distance between two…
mrgloom