Questions tagged [hamming-distance]

The Hamming distance is a mathematical distance function for a pair of strings (sequences) that can be computed with a binary calculation. It counts the number of symbols in the string that are different. Posts that are not about implementation may belong on https://math.stackexchange.com.

For the special case of two binary strings, it may be implemented as the bitcount of their XOR:

int H(int a, int b){
    return popcount(a^b);
}
291 questions
2
votes
3 answers

Quick way to count number of position match of a given character between all rows pairwise

I have a matrix and I want to identify the number of times that each character appears in the same position between all pairwise. A example of the way I'm doing is below, but my matrix has 10,000 rows and it's taking too long. # This code will…
celacanto
  • 315
  • 2
  • 11
2
votes
1 answer

loop different rows from dataframe A and apply them to the same individual rows in dataframe B

As I am working on a Q-Matrix question and needing to calculate hamming distance, I am having problem of different rows (as vectors) from dataframe A to the same row in dataframe B. Specifically, df2 is a 6x3 dataframe and df1 is a 3x3 dataframe. …
Edward Lin
  • 609
  • 1
  • 9
  • 16
2
votes
0 answers

What is the term for the conceptual distance between ordered nodes?

What is the name for the ordered relation between nodes? For example: A color ontology represented in a trie has ordered color objects such that the marginal node between yellow and blue is green, and node between blue and green is teal, etc. I…
forest.peterson
  • 755
  • 2
  • 13
  • 30
2
votes
2 answers

nested loops avoiding self and reciprocal comparisons

Since this must be quite a common scenario I'm wondering whether there are any pre-existing solutions to the following. Lets say I have a set of N strings, and I am computing distances the distances between them. In this case it's a Hamming…
Joe Healey
  • 1,232
  • 3
  • 15
  • 34
2
votes
1 answer

hamming_loss not support in cross_val_score?

I am using sklearn and skmultilearn to do some research about multilabel. I was just wondering that why the hamming_loss can not be used in cross_val_score, since it can really be used alone.
iromise
  • 23
  • 2
2
votes
1 answer

Data structure for Hamming cube

I have a Hamming cube, of general dimension, but in practice, usually, the dimension ranges from 3 to 6. The search algorithm is: Input: any vertex, `v`. Find all vertices that lie in Hamming distance 1 from `v`. Find all vertices that lie in…
gsamaras
  • 71,951
  • 46
  • 188
  • 305
2
votes
0 answers

Using pdsit with string value in python scipy

I have a following code and I want to calculate the hamming strings of the strings: from pandas import DataFrame import numpy as np import pandas as pd from scipy.spatial.distance import pdist, squareform df = pd.read_csv("3d_printing.csv",…
Cenk_Mitir
  • 113
  • 11
2
votes
2 answers

Calculate Hamming distance between two strings of binary digits in Matlab

I have two equal length strings containing 1's and 0's. Each string is 128-bits long, and I want to calculate the Hamming distance between them. What's the best way I can go about doing this? e.g. a='1000001' and b='1110001' --> dist=Hamming(a,b);
GobiasKoffi
  • 4,014
  • 14
  • 59
  • 66
2
votes
4 answers

Generate all sequences of bits within Hamming distance t

Given a vector of bits v, compute the collection of bits that have Hamming distance 1 with v, then with distance 2, up to an input parameter t. So for 011 I should get ~~~ 111 001 010 ~~~ -> 3 choose 1 in number 101 000 110 ~~~ -> 3 choose…
gsamaras
  • 71,951
  • 46
  • 188
  • 305
2
votes
1 answer

Optimize Hamming Distance Python

I have around 1M of binary numpy array which I need to get Hamming Distance between them to found de k-nearest-neighbours, the fastest method that I get is using cdist, returning a float matrix with distance. Since I don't have memory enough to get…
jevanio
  • 60
  • 2
  • 12
2
votes
0 answers

Fastest way to calculate Hamming Distance in C#

I have a large collection (n = 20,000,000) of BigInteger, representing bit arrays of length 225. Given a single BigInteger, I want to find the x BigInteger within my collection below a certain Hamming distance. Currently, I convert all BigInteger to…
aseipel
  • 728
  • 7
  • 15
2
votes
1 answer

Similarity of two Hexadecimal numbers

I am trying to find similar hashes (hexadecimal hash) using hamming and Levenshtein distance. Lets say two hashes are similar if their hamming distance is less than 10 (number of differing bits). Hash 1= ffffff (base 16) Hash 2= fffff0 (base…
Anandan
  • 353
  • 3
  • 17
2
votes
2 answers

Quickest way to find smallest Hamming distance in a list of fixed-length hexes

I'm using Imagehash in Python to generate 48-digit hexadecimal hashes of around 30,000 images, which I'm storing in a list of dictionaries (the phashes as well as some other image properties). For example: [{"name":"name1",…
IronWaffleMan
  • 2,513
  • 5
  • 30
  • 59
2
votes
1 answer

generate N numbers such that hamming distance between each of them is at least D

Given N, B, and D: Find a set of N codewords (1 <= N <= 64), each of length B bits (1 <= B <= 8), such that each of the codewords is at least Hamming distance of D (1 <= D <= 7) away from each of the other codewords. The Hamming distance between a…
2
votes
3 answers

how to code vector to matrix hamming distance in Python?

I like to develop a query system that finds the most similar items to given one based on a binary signature extracted from the data. I probe for the most efficient way since I have runtime constraints. I tried to use scipy distance but it was too…
erogol
  • 13,156
  • 33
  • 101
  • 155