Questions tagged [distance-matrix]

A distance matrix is a matrix containing the pairwise distances between the elements of a set. The "distance" may rely on the concept of metric.

The distance matrix for an N-elements set is a NxN matrix in which entry (i,j) contains the distance between items i and j.
The distance might be related to the concept of metric (e.g. Euclidean distance, Mahlanobis distance, and so on) but there are also non-metric distance matrices: an adjacency matrix is an example of such non-metric distance matrices.

174 questions
2
votes
2 answers

Computation of Distance Matrices for Binary Data in python

I am performing a hierarchical clustering analysis in python. My variables are binary so I was wondering how to calculate the binary euclidean distance. According to the literature, it is possible to use this distance metric with this clustering…
2
votes
1 answer

How to parallelize computation of pairwise distance matrix?

My problem is roughly as follows. Given a numerical matrix X, where each row is an item. I want to find each row's nearest neighbor in terms of L2 distance in all rows except itself. I tried reading the official documentation but was still a little…
user2804929
  • 117
  • 1
  • 7
2
votes
2 answers

Average geodetic distance between subsets of points (same ID) using distm(distVincentyEllipsoid) and storing the results in a new dataframe in R

My database has the following structure: > long <- c(13.2345, 14.2478, 16.2001, 11.2489, 17.4784, 27.6478, 14.2500, 12.2100, 11.2014, 12.2147) > lat <- c(47.1247, 48.2013, 41.2547, 41.2147, 40.3247, 46.4147, 42.4786, 41.2478, 48.2147,…
Mapos
  • 177
  • 1
  • 9
2
votes
1 answer

Connecting the given places using google maps api

Let say that a user wants to tour the US. When I list the places I want to travel with the origin and destination, I want the Google API to make a route in the shortest way possible. For Eg: My Origin: Las Vegas, Destination: New York. And the…
SO-user
  • 1,458
  • 2
  • 21
  • 43
2
votes
0 answers

OpenCL: Parallelize Lagged Vector Operation (Triangular distance matrix)

I'm having trouble creating a kernel for this serial code: void distlag(double *x, double *y,double *data, double *ncoord, double *res) { // x: x-coordinate // y: y-coordinate // data: value associated with x-y coordinate // ncoord:…
VMO
  • 66
  • 8
2
votes
2 answers

R Interclass distance matrix

This question is sort of a follow-up to how to extract intragroup and intergroup distances from a distance matrix? in R. In that question, they first computed the distance matrix for all points, and then simply extracted the inter-class distance…
itf
  • 559
  • 1
  • 3
  • 15
2
votes
1 answer

convert distance matrix to phylogenetic tree in newick string format

I have created a distance matrix by reading a FASTA file, now I'm asked to write a function that will produce a phylogenetic tree in newick string format. the function will take one argument a distance matrix. Can you please help me with some code…
2
votes
1 answer

Convert Pandas Distance Matrix DataFrame to Rows

I have a pandas DataFrame with n ids as columns and indices. Is there some easy method that I'm not considering to convert this to n2 rows of distances like so? id1 id2 id1 0 .75 id2 .75 0 to id1 id2 .75 id1 id1 0 id2 id2 …
Cody Braun
  • 659
  • 6
  • 17
1
vote
2 answers

Distance matrix for every row of coordinates in a dataframe

I have a dataframe with many rows, and each row contains a sample ID and two samples, which I am treating as coordinates. I want to calculate the euclidean distance between each set of coordinates (i.e., each row) to generate a distance matrix…
1
vote
0 answers

Merge two phyloseq with different taxonomic data

I have 7 samples, I can successfully create a phyloseq file, make a distance matrix and calculate the beta diversity with adonis2. However, I want to add one more sample from an earlier sequencing, so I have to work with a different shared, taxonomy…
vicey
  • 13
  • 3
1
vote
1 answer

Passing a Distance Matrix to sklearn's K-Means Clustering

I am currently doing research using the ASJP Database and I have a distance matrix of the similarities between 30 languages in the shape of (30 x 30). I would like to perform K-Means Clustering on these languages. I passed the distance matrix to…
1
vote
1 answer

Vectorizing Mahalanobis distance - numpy

I have been looking at the answer from @Danita's answer (Vectorizing code to calculate (squared) Mahalanobis Distiance), which uses np.einsum to calculate the squared Mahalanobis distance. In that case, the vectors are: X of shape (m, n), U of shape…
1
vote
0 answers

Plotting a distance matrix according to another distance matrix

So I have 50 samples, for which I have data on the chemical composition, and the spatial coordinates. With those, I created a Bray-Curtis dissimilarity matrix of the chemical similarity between samples, and a distance matrix based on the spatial…
1
vote
1 answer

Clustering a pairwise distance matrix

So assuming I have a precomputed distance matrix 1 2 3 4 5 1 0.000 1.154 1.235 1.297 0.960 2 1.154 0.000 0.932 0.929 0.988 3 1.235 0.932 0.000 0.727 1.244 4 1.297 0.929 0.727 …
Philipp O.
  • 41
  • 1
  • 4
1
vote
0 answers

Get the most different combinations while distributing values equally

I'm trying to get the most different combinations of a given set of variables:values but keeping every element more or less equally distributed. For example, for given: { 'cat0' : [0,1,2,3,4,5], 'cat1' : [0,1,2,3,4,5], 'cat2' :…
1 2
3
11 12