
I have a large number of vector triples, and I would like to compute the scalar triple product for them. I can do

import numpy

n = 871
a = numpy.random.rand(n, 3)
b = numpy.random.rand(n, 3)
c = numpy.random.rand(n, 3)

# <a, b x c>
omega = numpy.einsum('ij, ij->i', a, numpy.cross(b, c))

but numpy.cross is fairly slow. The symmetry of the problem (its Levi-Civita expression is eps_{ijk} a_i b_j c_k) suggests that there might be a better (faster) way to compute it, but I can't seem to figure it out.

Any hints?

Nico Schlömer
    Past questions have shown that for what it does, the current `cross` is efficient, http://stackoverflow.com/q/39662540/901925 – hpaulj Feb 10 '17 at 12:49
  • Cross products alone are fine, this one is with a dot product that gives it a particular symmetry. It strikes me as odd that there should be no further optimization for it. – Nico Schlömer Feb 10 '17 at 12:53
  • if A, B and C are specific 3-vectors, the Einstein notation you are looking for is `\epsilon_{ijk} A^i B^j C^k`, where epsilon is the Levi-Civita symbol. I don't think it's something that has been implemented in `numpy.einsum`. As @B. M. said, it is exactly equivalent to the det of `np.dstack([A, B, C])`. – tsvikas Feb 10 '17 at 14:21
  • A previous use of Levi-Civita in `einsum` http://stackoverflow.com/a/20910319/901925. The new `optimize` might improve that performance. – hpaulj Feb 10 '17 at 16:38
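Following up on the comments, the Levi-Civita contraction `eps_{ijk} a_i b_j c_k` can be spelled out with `einsum` directly. A minimal sketch (not from the thread; it builds the full eps tensor and is usually slower than `cross` unless `optimize=True` is passed):

```python
import numpy as np

# Levi-Civita symbol: +1 for even permutations, -1 for odd, 0 otherwise
eps = np.zeros((3, 3, 3))
eps[0, 1, 2] = eps[1, 2, 0] = eps[2, 0, 1] = 1.0
eps[0, 2, 1] = eps[2, 1, 0] = eps[1, 0, 2] = -1.0

n = 871
rng = np.random.default_rng(0)
a, b, c = rng.random((n, 3)), rng.random((n, 3)), rng.random((n, 3))

# eps_{ijk} a_i b_j c_k, vectorized over the row index n
omega = np.einsum('ijk,ni,nj,nk->n', eps, a, b, c, optimize=True)
```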

3 Answers


It's just the determinant.

omega = np.linalg.det(np.dstack([a, b, c]))

But it is slower....

Another equivalent solution is omega = (a * np.cross(b, c)).sum(1).

But I think you have to compute about 9 (for the cross) + 3 (for the dot) + 2 (for the sum) = 14 operations per determinant, so this seems to be near optimal. At best you will win a factor of two within numpy.

EDIT:

If speed is critical, you must go to a lower level. numba is an easy way to do that, and it gives roughly a 15x speedup here:

import numpy as np
from numba import njit

@njit
def multidet(a,b,c):
    n=a.shape[0]
    d=np.empty(n)
    for i in range(n):
        u,v,w=a[i],b[i],c[i]
        d[i]=\
        u[0]*(v[1]*w[2]-v[2]*w[1])+\
        u[1]*(v[2]*w[0]-v[0]*w[2])+\
        u[2]*(v[0]*w[1]-v[1]*w[0])  # 14 operations / det
    return d

some tests:

In [155]: %timeit multidet(a,b,c)
100000 loops, best of 3: 7.79 µs per loop

In [156]: %timeit numpy.einsum('ij, ij->i', a, numpy.cross(b, c))
10000 loops, best of 3: 114 µs per loop


In [159]: allclose(multidet(a,b,c),omega)
Out[159]: True
B. M.

I've done a comparison of the methods mentioned in the answers. The results:

[Plot: runtime of einsum_cross, det, and slice_sum vs. n]

@Divakar's slice_sum beats the einsum-cross one by a bit.

For the sake of completeness, let me note that there is another method solely relying on dot-products and a sqrt, see here. That method is slightly slower than both einsum-cross and slice-sum.
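That dot-product method presumably relies on the fact that the squared triple product equals the determinant of the Gram matrix G_ij = <v_i, v_j>. A sketch of that idea (an assumption about the linked method, not a quote of it; note it only recovers the magnitude, the sign is lost):

```python
import numpy as np

n = 871
rng = np.random.default_rng(0)
a, b, c = rng.random((n, 3)), rng.random((n, 3)), rng.random((n, 3))

# Gram matrix G[n, i, j] = <v_i, v_j> for the triple (a, b, c) in each row
vs = (a, b, c)
G = np.empty((n, 3, 3))
for i, u in enumerate(vs):
    for j, v in enumerate(vs):
        G[:, i, j] = (u * v).sum(axis=1)

# det(G) = (a . (b x c))**2; the clip guards against tiny negative round-off
abs_omega = np.sqrt(np.clip(np.linalg.det(G), 0.0, None))
```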


The plot was created with perfplot,

import numpy as np
import perfplot


def einsum_cross(a, b, c):
    return np.einsum("ij, ij->i", a, np.cross(b, c))


def det(a, b, c):
    return np.linalg.det(np.dstack([a, b, c]))


def slice_sum(a, b, c):
    c0 = b[:, 1] * c[:, 2] - b[:, 2] * c[:, 1]
    c1 = b[:, 2] * c[:, 0] - b[:, 0] * c[:, 2]
    c2 = b[:, 0] * c[:, 1] - b[:, 1] * c[:, 0]
    return a[:, 0] * c0 + a[:, 1] * c1 + a[:, 2] * c2


b = perfplot.bench(
    setup=lambda n: (
        np.random.rand(n, 3),
        np.random.rand(n, 3),
        np.random.rand(n, 3),
    ),
    n_range=[2**k for k in range(1, 20)],
    kernels=[einsum_cross, det, slice_sum],
)
b.save("out.png")
b.show()
Nico Schlömer

Here's one approach making use of slicing and summing -

def slicing_summing(a,b,c):
    c0 = b[:,1]*c[:,2] - b[:,2]*c[:,1]
    c1 = b[:,2]*c[:,0] - b[:,0]*c[:,2]
    c2 = b[:,0]*c[:,1] - b[:,1]*c[:,0]
    return a[:,0]*c0 + a[:,1]*c1 + a[:,2]*c2

We can replace the first three steps that compute c0, c1 and c2 with a one-liner that gives their stacked version, like so -

b[:,[1,2,0]]*c[:,[2,0,1]] - b[:,[2,0,1]]*c[:,[1,2,0]]

This would create another (n,3) array, which then has to be multiplied element-wise with a and sum-reduced to get the (n,) shaped result. With the proposed slicing_summing method, we get to that (n,) shaped array directly by summing the three slices, thus avoiding the intermediate (n,3) array.
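For completeness, the one-liner combined with that sum-reduction could look like this (a sketch, not part of the original answer):

```python
import numpy as np

n = 871
rng = np.random.default_rng(0)
a, b, c = rng.random((n, 3)), rng.random((n, 3)), rng.random((n, 3))

# One-liner cross product b x c as an intermediate (n, 3) array
bxc = b[:, [1, 2, 0]] * c[:, [2, 0, 1]] - b[:, [2, 0, 1]] * c[:, [1, 2, 0]]

# Element-wise multiply with a, then sum-reduce over the 3 components
omega = (a * bxc).sum(axis=1)
```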

Sample run -

In [86]: # Setup inputs    
    ...: n = 871
    ...: a = np.random.rand(n, 3)
    ...: b = np.random.rand(n, 3)
    ...: c = np.random.rand(n, 3)
    ...: 

In [87]: # Original approach
    ...: omega = np.einsum('ij, ij->i', a, np.cross(b, c))

In [88]: np.allclose(omega, slicing_summing(a,b,c))
Out[88]: True

Runtime test -

In [90]: %timeit np.einsum('ij, ij->i', a, np.cross(b, c))
10000 loops, best of 3: 84.6 µs per loop

In [91]: %timeit slicing_summing(a,b,c)
1000 loops, best of 3: 63 µs per loop
Divakar
  • Can you elaborate on how to implement this using the "one-liner" version? I'm not sure what you mean by "sum-reduction" in this case. Is there some clever trick to multiply the columns of two arrays pairwise, and sum the results? – shadowtalker Feb 22 '23 at 06:00