0

I am newbie in using MDAnalysis as well as python in general. I have performed a 100 ns simulation on a protein and wanted to calculate the survival probability of my water molecules within my selection. The thing is using the given code of MDAnalsyis survival probability, I am successfully generating the graph (x=time, y=SP) but cannot understand whether the time is in frames/ns/ps.

For trial purpose, i tried to run my last 1000 frames i.e. 2ns . The following is the code that i have used for this

import MDAnalysis
from MDAnalysis.analysis.waterdynamics import SurvivalProbability as SP
import numpy as np
import matplotlib.pyplot as plt
u = MDAnalysis.Universe("md.psf", "md1000f.dcd", in_memory= True)
select = "byres name OH2 and sphzone 3.5 (name N and resid 143 144 145)"
sp = SP(u, select, verbose=True)
sp.run(start=0, stop=1000, tau_max=20)
tau_timeseries = sp.tau_timeseries
sp_timeseries = sp.sp_timeseries

# print in console
for tau, sp in zip(tau_timeseries, sp_timeseries):
      print("{time} {sp}".format(time=tau, sp=sp))

Here, i wanted all the water molecules within a sphere radius of 3.5 Å forming H-bonds with N atoms of residue Ids 143, 144 and 145.

This is the result that i have got while running the code: 0 1 1 0.8738224637681159 2 0.8059945504087194 3 0.7505464480874317 4 0.7115068493150685 5 0.6736263736263736 6 0.6382920110192837 7 0.6041436464088398 8 0.573961218836565 9 0.5491666666666666 10 0.5242339832869081 11 0.5033519553072625 12 0.4809523809523809 13 0.45983146067415726 14 0.43999999999999995 15 0.4186440677966101 16 0.3985835694050991 17 0.37698863636363633 18 0.3566951566951567 19 0.33485714285714285 20 0.31432664756446993

My question is:

  1. we have already input the start and stop, why we still need a tau_max? Should it calculate the probability of water molecules present within that selection from 1 to 1000 frames (0 to 2ns)?

  2. What does tau_max=20 means? what is the unit of tau_max is it picosecond or nanosecond?

  3. What will be the optimal tau_max value for 100ns simulation?

Looking forward to your reply, thanks.

I have tried to increase the tau_max values ranging from 50 to 1000. Thinking that it is asking for number of frames. Since I have loaded 1000 frames, i have assigned tau_max as 1000. However, the values of SP is only ending at 56 tau number and from there onwards the values are either 0.0 or NaN.

1 Answers1

1

The analysis is agnostic to time and the time depends on the frames which here you use (start, stop, tau_max). Please recheck the documentation. In your code here, tau_max means analyse the survival probability within 20 frames stretch anywhere along the trajectory defined by frames 0 to 1000. The optimal value of tau_max depends on the problem: the temperature, the pocket shape, the diffusion rate, the strength of the interaction etc. Here your result shows that within 11 frames, only half of the water molecules survive.

mateuszb
  • 1,072
  • 13
  • 26
  • Thank you for explaining the logic behind tau_max. In that case, if I have to perform SP on water molecules for entire time series of 100ns i.e. 50,000 frames, then what should I put the value of tau_max in my code? So that the graph which I will generate using matplotlib has x-axis=0 to 100 ns and y-axis = SP values from 0 to 1... – Alvea Tasneem Jan 09 '23 at 09:16
  • I think you might want to revisit what SP means here. SP has only X axis that is very short, as you can see from your data 11 frames already removes half of the molecules. So if you did the x-axis (ie max_tau very large) it would have the value 0 after ~50 frames. You might want to try `sp.run(tau_max=50)` if you want to use the entire simulation. But if the water does not "survive" after 50 frames, it won't come back later, by definition. – mateuszb Jan 09 '23 at 10:40
  • Thank you for your reply. I am still not clear as to how can I calculate the SP of my water from 0 to 100ns. Are there any changes I need to do in my above code that can solve this issue? – Alvea Tasneem Jan 11 '23 at 11:12
  • By default the entire simulation is used with `sp.run(tau_max=50)`, where the two other parameters are used to apply it only on a chunk of it. – mateuszb Jan 11 '23 at 11:14