8

Given a 2 dimensional array, where each row represents the position vector of a particle, how do I compute the mean square displacement efficiently (using FFT)? The mean square displacement is defined as

delta_sq

where r(m) is the position vector of row m, and N is the number of rows.

Community
  • 1
  • 1
thomasfermi
  • 1,183
  • 1
  • 10
  • 20

2 Answers2

20

The following straight forward method for the msd works, but it is O(N**2) (I adapted the code from this stackoverflow answer by user morningsun)

def msd_straight_forward(r):
    shifts = np.arange(len(r))
    msds = np.zeros(shifts.size)    

    for i, shift in enumerate(shifts):
        diffs = r[:-shift if shift else None] - r[shift:]
        sqdist = np.square(diffs).sum(axis=1)
        msds[i] = sqdist.mean()

    return msds

However, we can make this code way faster using the FFT. The following consideration and the resulting algorithm are from this paper, I will just show how to implement it in python. We can split the MSD in the following way

split_delta

Thereby, S_2(m) is just the autocorrelation of the position. Note that in some textbooks S_2(m) is denoted as autocorrelation (convention A) and in some S_2(m)*(N-m) is denoted as autocorrelation (convention B). By the Wiener–Khinchin theorem, the power spectral density (PSD) of a function is the Fourier transform of the autocorrelation. This means we can compute a PSD of a signal and Fourier-invert it, to get the autocorrelation (in convention B). For discrete signals we get the cyclic autocorrelation. However, by zero-padding the data, we can get the non-cyclic autocorrelation. The algorithm looks like this

def autocorrFFT(x):
  N=len(x)
  F = np.fft.fft(x, n=2*N)  #2*N because of zero-padding
  PSD = F * F.conjugate()
  res = np.fft.ifft(PSD)
  res= (res[:N]).real   #now we have the autocorrelation in convention B
  n=N*np.ones(N)-np.arange(0,N) #divide res(m) by (N-m)
  return res/n #this is the autocorrelation in convention A

For the term S_1(m), we exploit the fact, that a recursive relation for (N-m)*S_1(m) can be found (This is explained in this paper in section 4.2). We define

Define_D_Q

And find S_1(m) via

recursive

This yields the following code for the mean square displacement

def msd_fft(r):
  N=len(r)
  D=np.square(r).sum(axis=1) 
  D=np.append(D,0) 
  S2=sum([autocorrFFT(r[:, i]) for i in range(r.shape[1])])
  Q=2*D.sum()
  S1=np.zeros(N)
  for m in range(N):
      Q=Q-D[m-1]-D[N-m]
      S1[m]=Q/(N-m)
  return S1-2*S2

You can compare msd_straight_forward() and msd_fft() and will find that they yield the same results, though msd_fft() is way faster for large N

A small benchmark: Generate a trajectory with

r = np.cumsum(np.random.choice([-1., 0., 1.], size=(N, 3)), axis=0)

For N=100.000, we get

$ %timeit msd_straight_forward(r)
1 loops, best of 3: 2min 1s per loop

$ %timeit msd_fft(r)
10 loops, best of 3: 253 ms per loop
Community
  • 1
  • 1
thomasfermi
  • 1,183
  • 1
  • 10
  • 20
  • Very nice implementation. How would you modify `msd_fft()` to only compute the 100th first lag time for example ? – hadim Dec 11 '15 at 18:40
  • @HadiM You mean calculate `msd_fft(r)[:100]` more efficiently? – thomasfermi Dec 11 '15 at 18:45
  • Exactly and if it's possible ! – hadim Dec 11 '15 at 18:57
  • Mhh, sorry, I don't think it's possible to do it more efficiently (of course for S1 we can, but for S2, which is the main work, it seems impossible)... If you only need the MSD for the first 100 time steps, it might be faster to use the straight forward method and only go up to 100 in the loop. You should check for your data what is faster. – thomasfermi Dec 11 '15 at 19:46
  • Ok. Thank you. I will do more test then. For the record, I asked to trackpy devs if they are interested in your algorithm : https://github.com/soft-matter/trackpy/issues/335 – hadim Dec 11 '15 at 20:15
  • It's not really my algorithm. As mentioned in the answer, I took it from [this open access paper](http://www.neutron-sciences.org/articles/sfn/abs/2011/01/sfn201112010/sfn201112010.html) and just translated it to python/numpy :) – thomasfermi Dec 11 '15 at 20:22
  • Let us [continue this discussion in chat](http://chat.stackoverflow.com/rooms/97682/discussion-between-thomasfermi-and-hadim). – thomasfermi Dec 11 '15 at 20:42
  • 1
    [Here](https://stackoverflow.com/a/69767209/12131616) I rewrote this algorithm for several particles at the same time using only one function and avoiding the loop for `S1`, as mentioned in @Dominik Wille answer. – Puco4 Oct 29 '21 at 10:20
1

Using numpy.cumsum you can also avoid looping over range(N) in the S1 calculation:

sq = map(sum, map(np.square, r))
s1 = 2 * sum(sq) - np.cumsum(np.insert(sq[0:-1], 0, 0) + np.flip(np.append(sq[1:], 0), 0))