Questions tagged [hamming-distance]

The Hamming distance is a mathematical distance function for a pair of strings (sequences) that can be computed with a binary calculation. It counts the number of symbols in the string that are different. Posts that are not about implementation may belong on https://math.stackexchange.com.

For the special case of two binary strings, it may be implemented as the bitcount of their XOR:

int H(int a, int b){
    return popcount(a^b);
}

291 questions

votes

1 answer

Eliminating sequences in a message

I have an odd communications channel, and I need to detect errors as well as eliminate certain sequences in the channel. Each message is 12 bits long, split into 3 nibbles (of 4 bits each). I need to extract at least 450 different codes out of this,…

asked May 15 '13 at 01:03

Adam Davis

91,931
60
264
330

votes

2 answers

Bit string nearest neighbour searching

I have hundreds of thousands of sparse bit strings of length 32 bits. I'd like to do a nearest neighbour search on them and look-up performance is critical. I've been reading up on various algorithms but they seem to target text strings rather than…

compression hash nearest-neighbor hamming-distance

asked Mar 31 '12 at 21:13

user1305541

votes

1 answer

Fast computation of pairs with least hamming distance

Problem Suppose you have N (~100k-1m) integers/bitstrings each K (e.g. 256) bits long. The algorithm should return the k pairs with the lowest pairwise Hamming distance. Example N = 4 K = 8 i1 = 00010011 i2 = 01010101 i3 = 11000000 i4 =…

algorithm hamming-distance

asked Aug 16 '11 at 22:58

rfalke

votes

1 answer

OpenMP takes more time than expected

So, I am facing some difficulties using openMp. I am beginner and I do not know what I am doing wrong. This is a project for one of my courses at University, so I do not seek for the solution, rather for a hint or an explanation. The project is to…

c parallel-processing openmp hamming-distance

asked Mar 15 '18 at 23:25

Sotiris Dimitras

votes

1 answer

Optimized CUDA matrix hamming distance

Is anyone aware of an optimized CUDA kernel for computing a GEMM style hamming distance between two matrices of dimension A x N and N x B? The problem is nearly identical to GEMM, but instead computes the sum( a_n != b_n ) for each vector {1 ...…

c++ c matrix cuda hamming-distance

asked Jul 09 '16 at 00:56

lakinsm

votes

1 answer

Python - grouping defaultdict values by hamming distance in keys

I have a default dict with ~700 keys. The keys are in a format such as A_B_STRING. What I need to do is split the key by '_', and compare the distance between 'STRING' for each key, if A and B are identical. If the distance is <= 2, I want to group…

python combinations string-comparison defaultdict hamming-distance

asked Jun 10 '16 at 15:48

st.ph.n

votes

3 answers

Fastest way to compute all 1-hamming distanced neighbors of strings?

I am trying to compute hamming distances between each node in a graph of n nodes. Each node in this graph has a label of the same length (k) and the alphabet used for labels is {0, 1, *}. The '*' operates as a don't care symbol. For example, hamming…

java string hamming-distance

asked Apr 09 '15 at 15:45

begumgenc

votes

2 answers

Interpreting Hamming Distance speed in python

I've been working on making my python more pythonic and toying with runtimes of short snippets of code. My goal to improve the readability, but additionally, to speed execution. This example conflicts with the best practices I've been reading…

python runtime timeit hamming-distance

asked Feb 04 '15 at 19:32

Scrocco

votes

1 answer

Find all k-nearest neighbors

Problem: I have N (~100m) strings each D (e.g. 100) characters long and with a low alphabet (eg 4 possible characters). I would like to find the k-nearest neighbors for every one of those N points ( k ~ 0.1D). Adjacent strings define via hamming…

algorithm cluster-analysis computational-geometry hamming-distance

asked Jan 28 '15 at 10:04

Ashki

votes

2 answers

Sum of Hamming Distances

I started preparing for an interview and came across this problem: An array of integers is given Now calculate the sum of Hamming distances of all pairs of integers in the array in their binary representation. Example: given {1,2,3} or…

optimization hamming-distance

asked Jan 22 '15 at 18:15

RoadToAnInternship

votes

1 answer

How do I find the k-nearest values in n-dimensional space?

I read about kd-trees but they are inefficient when the dimensionality of the space is high. I have a database of value and I want to find the values that are within a certain hamming distance of the query. For instance, the database is a list of…

computational-geometry multivariate-partition minhash hamming-distance

asked Mar 06 '10 at 13:49

Eyal

5,728
7
43
70

votes

2 answers

Word distance algorithm for OCR

I am working with an OCR output and I'm searching for special words inside it. As the output is not clean, I look for elements that match my inputs according to a word distance lower than a specific threshold. However, I feel that the Levenshtein…

algorithm ocr levenshtein-distance hamming-distance

asked Mar 31 '14 at 10:03

zenbeni

7,019
3
29
60

votes

2 answers

Fast way of doing k means clustering on binary vectors in c++

I want to cluster binary vectors (millions of them) into k clusters.I am using hamming distance for finding the nearest neighbors to initial clusters (which is very slow as well). I think K-means clustering does not really fit here. The problem is…

vector binary cluster-analysis hamming-distance

asked Jun 10 '13 at 21:29

NightFox

votes

4 answers

The hunt for the fastest Hamming Distance C implementation

I want to find how many different characters two strings of equal length have. I have found that xoring algorithms are considered to be the fastest, but they return distance expressed in bits. I want the results expressed in characters. Suppose that…

c performance assembly optimization hamming-distance

asked Apr 13 '13 at 12:07

pdrak

votes

2 answers

What is the most efficient way to identify text similarity between items in large lists of strings in Python?

The following piece of code achieves the results I'm trying to achieve. There is a list of strings called 'lemmas' that contains the accepted forms of a specific class of words. The other list, called 'forms' contains a lot of spelling variations of…

python text nlp cosine-similarity hamming-distance

asked Apr 05 '23 at 20:54

jfontana

Prev 1 2 3

…

19 20 Next