
I have the following code:

# positions: np.ndarray of shape (N, d)
# fitness: np.ndarray of shape (N,)
# mass: np.ndarray of shape (N,)
# force: np.ndarray of shape (N, d)

iteration = 1
while iteration <= maxiter:
    # K shrinks linearly from N at iteration 1 down to 1 at iteration maxiter
    K = round((iteration-maxiter)*(N-1)/(1-maxiter) + 1)

    for i in range(N):
        displacement = positions[:K] - positions[i]
        dist = np.linalg.norm(displacement, axis=-1)
        if i < K:
            dist[i] = 1.0       # prevent division by zero

        force_i = (mass[:K]/dist)[:, np.newaxis] * displacement
        rand = np.random.rand(K, 1)
        force[i] = np.sum(rand * force_i, axis=0)

    iteration += 1

So I have an array that stores the coordinates of N particles in d dimensions. I first need to calculate the Euclidean distance between particle i and the first K particles, then calculate the 'force' due to each of those K particles. Then I need to sum over the K particles to find the total force acting on particle i, and repeat for all N particles. This is only part of the code, but profiling shows this part is the most time-critical step.

So my question is how I can optimize the above code. I have tried to vectorize it as much as possible, and I'm not sure if there is still room for improvement. The profiling results say that {method 'reduce' of 'numpy.ufunc' objects}, fromnumeric.py:1778(sum) and linalg.py:2103(norm) take the longest time to run. Is the first one due to array broadcasting? How can I optimize these three function calls?

Physicist
  • This [answer](https://stackoverflow.com/a/52030534/8069403) is quite interesting. There is also the [`cdist`](https://docs.scipy.org/doc/scipy/reference/generated/scipy.spatial.distance.cdist.html) function in scipy – xdze2 Sep 10 '18 at 13:45
  • What are the typical values of `N` and `d`? – Divakar Sep 10 '18 at 16:17
  • `N` is usually between 50-100, and `d` is around 2-50, but it would be great if the optimization can still be applied even if `N` reaches ~2000 and `d` reaches ~100. – Physicist Sep 10 '18 at 23:48

2 Answers


I had to make some adjustments since your code was missing a few parts. But the first optimization would be to get rid of the `for i in range(N)` loop:

import numpy as np

np.random.seed(42)

N = 10
d = 3
maxiter = 50

positions = np.random.random((N, d))
force = np.random.random((N, d))
fitness = np.random.random(N)
mass = np.random.random(N)

iteration = 1
while iteration <= maxiter:
    K = round((iteration-maxiter)*(N-1)/(1-maxiter) + 1)

    # pairwise displacements of the first K particles from all N particles: (K, N, d)
    displacement = positions[:K, None] - positions[None, :]
    dist = np.linalg.norm(displacement, axis=-1)
    dist[dist == 0] = 1     # prevent division by zero

    force = np.sum((mass[:K, None, None] / dist[:, :, None])
                   * displacement
                   * np.random.rand(K, N, 1), axis=0)
    iteration += 1

Other improvements would be to try faster implementations of the norm, such as `scipy.spatial.distance.cdist` or `np.einsum`.
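For example, here is a sketch of both alternatives computing the same `(K, N)` distance matrix as the `np.linalg.norm` call above (toy sizes chosen arbitrarily):

```python
import numpy as np
from scipy.spatial.distance import cdist

np.random.seed(0)
N, d, K = 10, 3, 4
positions = np.random.random((N, d))

# displacement[k, i] = positions[k] - positions[i], shape (K, N, d)
displacement = positions[:K, None] - positions[None, :]

# 1) norm via einsum: sum of squares over the last axis, then sqrt
dist_einsum = np.sqrt(np.einsum('ijk,ijk->ij', displacement, displacement))

# 2) norm via cdist: builds the (K, N) distance matrix directly,
#    without materializing the (K, N, d) displacement array
dist_cdist = cdist(positions[:K], positions)

assert np.allclose(dist_einsum, dist_cdist)
```

`cdist` has the extra advantage that it never allocates the intermediate `(K, N, d)` array, which matters as `N` and `d` grow.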

user2653663

We can keep the loop, but optimize by pre-computing certain quantities -

import numpy as np
from scipy.spatial.distance import cdist

iteration = 1
while iteration <= maxiter:
    K = round((iteration-maxiter)*(N-1)/(1-maxiter) + 1)

    # all pairwise distances in one call; set the diagonal to 1
    # to prevent division by zero
    posd = cdist(positions, positions)
    np.fill_diagonal(posd, 1)

    # pre-compute the random weights times the mass/distance factors
    rands = np.random.rand(N, K)
    s = rands*(mass[:K]/posd[:, :K])
    for i in range(N):
        displacement = positions[:K] - positions[i]
        force[i] = s[i].dot(displacement)

    iteration += 1

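As a sanity check, the precomputed-`s` loop can be compared against the question's original inner loop on toy data (sizes chosen arbitrarily; the random factor is fixed to 1 in both versions so the results are comparable):

```python
import numpy as np
from scipy.spatial.distance import cdist

np.random.seed(1)
N, d, K = 8, 3, 5
positions = np.random.random((N, d))
mass = np.random.random(N)

# original per-particle computation from the question (rand fixed to 1)
force_ref = np.empty((N, d))
for i in range(N):
    displacement = positions[:K] - positions[i]
    dist = np.linalg.norm(displacement, axis=-1)
    if i < K:
        dist[i] = 1.0
    force_ref[i] = ((mass[:K] / dist)[:, None] * displacement).sum(axis=0)

# precomputed version: one cdist call, then a dot product per particle
posd = cdist(positions, positions)
np.fill_diagonal(posd, 1)
s = mass[:K] / posd[:, :K]          # rand omitted, i.e. fixed to 1
force = np.empty((N, d))
for i in range(N):
    force[i] = s[i].dot(positions[:K] - positions[i])

assert np.allclose(force, force_ref)
```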
Divakar
  • should it be `s = rands*(mass[:K]/posd[:,:K])`? With this change, then it gives the same results and is indeed faster. Thanks. – Physicist Sep 11 '18 at 00:34