1

I have a trajectory file from simulation of 20,000 frames with 5 ps time in between every frame, what I want to do is to calculate diffusion in 2 dimension (x and y axis). but to calculate diffusion in 2D, first I have to calculate Mean square displacement of the molecule under study. MSD calculates the average time taken by molecule to explore the system in random walks.

I am very new to python programming and I would really want some help to get started this problem and to solve this problem. Hope to get positive response.

Life Sciences
  • 87
  • 1
  • 1
  • 7
  • 1
    Please explain what you've tried, post some code, and ask a precise programming related question. – Mel Jul 07 '15 at 09:32

1 Answers1

8

Well the MSD is exactly as it sounds it is the mean square displacement so what you need to do is find the difference in the position (r(t + dt) -r(t)) for each position and then square it and finally take the mean. First you must find r from x and y which is easy enough. I am going to assume you are using numpy from here on out.

    import numpy as np
    r = np.sqrt(xdata**2 + ydata**2)
    diff = np.diff(r) #this calculates r(t + dt) - r(t)
    diff_sq = diff**2
    MSD = np.mean(diff_sq)

Now this is the general way to calculate MSD then you can compare with things like Brownian motion where MSD = 4Dt approximately in 2 dimensions.

jfish003
  • 1,232
  • 8
  • 12
  • That is really helpful, but from where will I get the X and Y data, do I have to calculate X and Y data from "random walk" for 20,000 frames? def randwalk(x,y): theta=2*math.pi*rd.rand() x+=math.cos(theta); y+=math.sin(theta); return (x,y) – Life Sciences Jul 07 '15 at 11:10
  • or I will get the X and Y from my trajectory file, which is in dcd format ? or do I have to calculate X and Y ? – Life Sciences Jul 07 '15 at 12:18
  • 1
    Oh so you don't even have trajectories calculated let me work on that. Do you want just Brownian motion? Or do you want something else like super or sub-diffusive motion? – jfish003 Jul 07 '15 at 14:07
  • I have trajectory in dcd format, total frames are of 20,000 total time in ns 100. I am trying to extract coordinates from trajectory using DCD reader and MDAnalysis python package. I have to calculate lipid lateral diffuion .. the formula above will go with lipids as well.... – Life Sciences Jul 07 '15 at 14:38
  • import MDAnalysis.coordinates reader= MDAnalysis.coordinates.reader("/path to dcd file") reader for timestep in reader [0:10]: print timestep[0] len(u.trajectory) – Life Sciences Jul 07 '15 at 14:39
  • 1
    Once you extract the trajectory it should give you the x and y coordinates, I would hope in numpy array form and then you should be able to use the above code. – jfish003 Jul 07 '15 at 14:44
  • import MDAnalysis.coordinates reader= MDAnalysis.coordinates.reader("/path to dcd file") reader for timestep in reader [0:10]: print timestep[0] len(u.trajectory) – Life Sciences Jul 07 '15 at 14:47
  • I am not familiar with dcd but from the MDAnalysis web page it seems like you should be able to use the _pos keyword. See below: https://mdanalysis.googlecode.com/svn/trunk/doc/html/documentation_pages/coordinates/init.html#trajectory-api – jfish003 Jul 07 '15 at 14:49
  • have extracted the coordinates in a text file but for X Y and Z , and I have to calculate MSD and diffusion for x and Y axis ... – Life Sciences Jul 07 '15 at 14:54
  • this is the example of sample coordinates – Life Sciences Jul 07 '15 at 14:55
  • [[ -3.35503888 2.00754738 15.24140358] [ -3.23976159 0.71509355 15.98832035] [ -4.14608479 0.52223921 16.54310417] ..., [ -6.88356876 -4.43872595 -24.77352142] [ 1.81303966 -5.45238352 -32.72003937] [ 13.96470165 -43.02565002 -36.96560287]] [[ -3.02747536e+00 2.48863101e+00 1.50928030e+01] [ -2.80984759e+00 1.11831498e+00 1.56884098e+01] [ -3.72466898e+00 9.10387099e-01 1.62234497e+01] – Life Sciences Jul 07 '15 at 14:55
  • can you please let me know how this snippet is calculating difference like from where I will get the values for np.diff (r) .as r = np.sqrt (x data **2 + ydata**2) will calculate values for r at time intervals but how will I calculate diff.. kindly can you explain this snippet ?? ... – Life Sciences Jul 09 '15 at 21:42
  • since r = np.sqrt(xdata**2 + ydata**2) you will have an array of r values, for example r = np.array([1.2,3.4,1.08,-2,-2.14, ...] ), np.diff is a function that subtracts each value from the next value so np.diff(r) in my example would produce np.array([2.2, -2.32,-3.08,-0.14...]) – jfish003 Jul 10 '15 at 16:40
  • To draw MSD(t) plot, should I calculate np.diff for t=1 and t=2 and t=3 and so on? I mean what shiuld I do to draw MSD(t) plot by python for random walk? @jfish003 – Atefeh Rashidi Dec 26 '19 at 09:48