
There are several posts on computing pairwise differences among vectors, but I cannot find how to compute all differences within a vector.

Say I have a vector, v.

v <- 1:4

I would like to generate a second vector that is the absolute value of all pairwise differences within the vector. Similar to:

abs(1-2) = 1
abs(1-3) = 2
abs(1-4) = 3
abs(2-3) = 1
abs(2-4) = 2
abs(3-4) = 1

The output would be a vector of 6 values, which are the result of my 6 comparisons:

output <- c(1, 2, 3, 1, 2, 1)

Is there a function in R that can do this?

colin

3 Answers

as.numeric(dist(v))

seems to work: it treats v as a one-column matrix and computes the Euclidean distance between rows, which in this case is sqrt((x-y)^2) = abs(x-y).

If we're golfing, then I'll offer c(dist(v)), which is equivalent and which I'm guessing will be unbeatable.

@AndreyShabalin makes the good point that using method="manhattan" will probably be slightly more efficient since it avoids the squaring/square-rooting stuff.
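As a minimal sketch of the point above (assuming `v` is a plain numeric vector), the two `dist()` variants can be checked against the expected output from the question. For one-dimensional input the Euclidean and Manhattan distances coincide, so both calls should return the same values. The names `d_euclid` and `d_manhattan` are only illustrative.

```r
v <- 1:4

# dist() treats v as a 4 x 1 matrix and returns the lower triangle
# of the distance matrix, stored column by column.
d_euclid    <- as.numeric(dist(v))                        # sqrt((x - y)^2)
d_manhattan <- as.numeric(dist(v, method = "manhattan"))  # abs(x - y)

d_euclid
# [1] 1 2 3 1 2 1

# Both methods agree for one-dimensional input.
all.equal(d_euclid, d_manhattan)
# [1] TRUE
```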

Ben Bolker

Let's play golf

abs(apply(combn(1:4,2), 2, diff))

@Ben, yours is a killer!

> system.time(apply(combn(1:1000,2), 2, diff))
   user  system elapsed 
   6.65    0.00    6.67 
> system.time(c(dist(1:1000)))
   user  system elapsed 
   0.02    0.00    0.01 
> system.time({
+ v <- 1:1000
+ z = outer(v,v,'-');
+ z[lower.tri(z)];
+ })
   user  system elapsed 
   0.03    0.00    0.03 

Who knew that elegant (read: understandable/flexible) code could be so slow?
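The comparison above can be rerun with something like the sketch below. The wrapper names (`via_combn`, `via_dist`, `via_outer`) are made up for illustration, the input is a small shuffled vector rather than `1:1000`, and no timings are quoted because they depend on the machine and R version. It also checks that the three approaches return the same values in the same order before timing them.

```r
set.seed(1)
v <- sample(200)   # unsorted input, so the sign of each difference matters

# Three ways to get all absolute pairwise differences within v.
via_combn <- function(v) abs(combn(v, 2, FUN = diff))
via_dist  <- function(v) as.numeric(dist(v))
via_outer <- function(v) {
  z <- outer(v, v, "-")
  abs(z[lower.tri(z)])
}

# All three return the pairs in the same order.
stopifnot(
  isTRUE(all.equal(via_combn(v), via_dist(v))),
  isTRUE(all.equal(via_outer(v), via_dist(v)))
)

# Timings depend on the machine and R version.
system.time(via_combn(v))
system.time(via_dist(v))
system.time(via_outer(v))
```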

mlt
  • Can you beat `c(dist(v))` (10 characters)? (Yours could be slightly shorter if you substitute `v` for `1:4`.) – Ben Bolker Jun 19 '14 at 19:47
  • Huh! I thought @BenBolker would win the tournament. I'd assume my example introduces 3 commonly used functions in one line as a bonus. – mlt Jun 19 '14 at 19:56
  • They both work well. I accepted this answer because I could understand what was going on more clearly. Furthermore, I can easily modify this code to do other pairwise operations (e.g. compute pairwise sums for the same vector). I can't do that with the dist() function. – colin Jun 19 '14 at 20:03
  • it will be interesting to see how the benchmarking goes, if @Vlo actually does it. This solution does fewer than half as many operations, but it does them less efficiently. – Ben Bolker Jun 19 '14 at 20:07
  • we could all switch to Julia :-) – Ben Bolker Jun 19 '14 at 20:28
  • +1 `combn` has a `FUN` argument, so you could just do: `combn(v,2,FUN=diff)` – Matthew Plourde Jun 19 '14 at 20:38
  • @MatthewPlourde Wow! I never cared to check this function's details. It takes 2/3 of the time of the version with apply. – mlt Jun 19 '14 at 20:41
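The `FUN` argument mentioned in the comments also covers the flexibility colin asked about. A minimal sketch, assuming `v` has at least two elements (a single positive integer would be treated by `combn` as `seq_len(v)`); the names `pair_diffs` and `pair_sums` are illustrative:

```r
v <- c(1, 2, 3, 4)

# Absolute pairwise differences: FUN is applied to each pair in turn.
pair_diffs <- combn(v, 2, FUN = function(p) abs(diff(p)))
pair_diffs
# [1] 1 2 3 1 2 1

# The same pattern works for other pairwise operations, e.g. sums.
pair_sums <- combn(v, 2, FUN = sum)
pair_sums
# [1] 3 4 5 5 6 7
```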

A possible solution is:

z = outer(v, v, '-')   # z[i, j] is v[i] - v[j]
abs(z[lower.tri(z)])   # abs() keeps the result non-negative when v is not sorted

[1] 1 2 3 1 2 1
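
A small sketch of why the sign matters here: `outer(v, v, '-')` stores `v[i] - v[j]`, so the lower triangle holds signed differences, which are all non-negative only when `v` is increasing. Wrapping the result in `abs()` gives the absolute differences the question asks for, whatever the ordering. The unsorted example vector and the names `signed` and `absolute` are assumptions for illustration.

```r
v <- c(4, 1, 3)

z <- outer(v, v, "-")          # z[i, j] is v[i] - v[j]

signed <- z[lower.tri(z)]      # column-major order: (2,1), (3,1), (3,2)
signed
# [1] -3 -1  2

absolute <- abs(signed)
absolute
# [1] 3 1 2
```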
Andrey Shabalin