3

A have a problem with the speed of the program written in Python. The program is "Simulation of ferromagnetic particles in a magnetic field", and more specifically in the magnetically inert liquid. The program works, but very slowly compared to the same program written in C++, but I have it written in Python, on a project to study.

Overall, the program code is based on loops, with lots of floating point calculations. There is a random number of particles (generated randomly position) which interact with each other under the influence of a magnetic field.

This is the initial position:

https://i.stack.imgur.com/T15Bb.jpg

And the final:

https://i.stack.imgur.com/0nU5D.jpg

The main loop (in SymMain.py, with k variable) iterations are time steps are calculated coordinates of the particles in them at the time and the forces acting on it (the attractive force and a small repulsive force). In order to speed up, I wanted to use parallel processing to calculate the number of iterations at the same time, but this is not possible, because the calculation in one iteration depend on the calculations in the previous one.

I did not know that Python is so slower than C + +. As an example, the simulation of 529 particles in one time step is calculated (on my computer):

C + + ~ 0.5s

Python ~ 50s

So, how to speed up the program? Please help.

This code is only calculate, not drawing particles as in the pictures. The number of simulate partcile is on SymConst.py, this is nrH*nrL.

SymMain.py

#coding:windows-1250

from os import system
from SymCalc import *
from SymParticle import *


if __name__ == '__main__':
    App = SymCalc()
    App.MainLoop()

SymParticle.py

#coding:windows-1250

from random import randint
from math import *
from SymConst import *
from SymParticle import *

class SymCalc(object):
    def __init__(self):
        # declaration lists containing the properties of the particles
        ParticleList = []
        ParticleListTemp = []
        t = 0.0

        # the initial values ​​of particle
        for x in range(0, nParticle):
            ParticleList.append(Particle(x+1, 0.0, 0.0, 0.0, 0.0, 0.0, 0.0, 0.0, 1.0, 0.0, 0.0, 8e-6))
            ParticleListTemp.append(Particle(x+1, 0.0, 0.0, 0.0, 0.0, 0.0, 0.0, 0.0, 1.0, 0.0, 0.0, 8e-6))

        # generating random coordinates X, Y 
        for x in range(0, nParticle):
            self.Rand(ParticleList[x], x)

        # time steps
        for k in range(0, k0):
            print "Time step = {0}".format(k)

            # calculation of forces
            for i in range(0, nParticle):
                for j in range(0, i-1):
                    self.ParticleCalculate(ParticleList[i], ParticleList[j], 1)
                for j in range(i+1, nParticle):
                    self.ParticleCalculate(ParticleList[i], ParticleList[j], 1)

            # display data
            if(k%Constant == 0):
                for i in range(0, nParticle):
                    self.Print(k, i, ParticleList[i], t)

            # changing the position of the particle
            for i in range(0, nParticle):
                self.ChangeOfPosition (ParticleList[i], dt)

                # reset forces
                for j in range(0, nParticle):
                    self.ParticleCalculate(ParticleList[i], ParticleList[j], 0)
                    self.ParticleCalculate(ParticleList[i], ParticleListTemp[j], 0)

            # next time step
            t += dt


    # random coordinates of the particles
    def Rand(self, part, lp):
        l = lp%nrL    # współrzędna X pola      
        part.x = (l-nrL/2)*(L0/nrL) + ((randint(0,32767)%100)-50)*(L0/nrL-2*part.r)/99    # X

        h = (lp+1-(lp)%nrL)/nrL    # Y      
        part.y = (h-nrH/2)*(H0/nrH) + ((randint(0,32767)%100)-50)*(H0/nrH-2*part.r)/99    # współrzędna Y cząstki


    # calculating function of the force acting on the particles
    def ParticleCalculate(self, part, part_tmp, p):
        # auxiliary variables
        # r_4, dx, dy, rr, sum, sx, sy, chi_eff, tmp, frepx, frepy, f0
        md = []

        if(p == 0):
            part.frx = 0
            part.fry = 0
            part.fx = 0
            part.fy = 0

        if(p == 1):
            # versor coordinates connecting the geometrical center of the particle
            dx = part.x - part_tmp.x
            dy = part.y - part_tmp.y

            # the distance between the geometric means of the particle
            rr = sqrt(dx*dx + dy*dy)

            if(rr < 0.85*(part.r + part_tmp.r)):
               print "ERROR: Invalid distance between the particles! Simulation aborted..."

            if(rr >= 10*part.r):
                # magnetic dipoles
                chi_eff = (3.*(MI_P - 1.))/((MI_P - 1) + 3.)

                md.append((4.*MI_0*pi*part.r*part.r*part.r*chi_eff*M_H0)/3.0)
                md.append((4.*MI_0*pi*part_tmp.r*part_tmp.r*part_tmp.r*chi_eff*M_H0)/3.0)

                tmp = pow(rr,7)

                # first member
                sum = (5.*(md[0]*part.nx*dx + md[0]*part.ny*dy) * (md[1]*part_tmp.nx*dx + md[1]*part_tmp.ny*dy)) / tmp
                sx = sum*dx
                sy = sum*dy
                tmp = tmp / (rr*rr)

                # second member
                sum = (md[0]*part.nx*md[1]*part_tmp.nx + md[0]*part.ny*md[1]*part_tmp.ny) / tmp
                sx -= sum*dx
                sy -= sum*dy

                # third member
                sx -= (md[0]*(md[1]*part_tmp.nx*dx + md[1]*part_tmp.ny*dy)*part.nx + md[1]*(md[0]*part.nx*dx + md[0]*part.ny*dy)*part_tmp.nx)/tmp
                sy -= (md[0]*(md[1]*part_tmp.nx*dx + md[1]*part_tmp.ny*dy)*part.ny + md[1]*(md[0]*part.nx*dx + md[0]*part.ny*dy)*part_tmp.ny)/tmp

                # finally
                tmp = (-3./(4*pi*MI_0))
                sx *= tmp
                sy *= tmp
                part.fx += sx
                part.fy += sy

                # short-range repulsive force
                tmp = pow(rr,15)
                r_4 = pow((part.r+part.r),4)

                f0 = 3.*fabs(md[0])*fabs(md[1])/(4.*pi*MI_0*r_4)
                frepx = pow((part.r+part.r),15)*dx/(tmp*rr)*f0
                frepy = pow((part.r+part.r),15)*dy/(tmp*rr)*f0

                part.frx += frepx;
                part.fry += frepy;


    # change the position of the particle
    def ChangeOfPosition(self, part, dt):
        part.ddx = 0
        part.ddy = 0

        part.ddx = 1/(6*pi*part.r*eta)*(part.fx+part.frx)*dt
        part.ddy = 1/(6*pi*part.r*eta)*(part.fy+part.fry)*dt

        # particles new position value
        part.x += part.ddx
        part.y += part.ddy

        if(part.x < -L0/2):
            part.x += L0
        elif(part.x > L0/2):
            part.x -= L0
        elif(part.y < -H0/2):
            part.y += H0
        elif(part.y > H0/2):
            part.y -= H0



    # display data
    def Print(self, k, i, part, t):
        print "-"*50
        print "\nParticle {0}".format(i+1)
        print "The resultant magnetostatic force fx = {0}".format(part.fx - part.frx)
        print "The resultant magnetostatic force fy = {0}\n".format(part.fy - part.fry)
        if(i == nParticle-1):
            print "\n\t\t---t={0}[s]---".format(t)
        print "-"*50

SymParticle.py

#coding:windows-1250

class Particle(object):
    # generating a particle properties
    def __init__(self, num, x, y, fx, fy, frx, fry, nx, ny, ddx, ddy, r):
        self.num = num     
        self.x = x     
        self.y = y     
        self.fx = fx     
        self.fy = fy     
        self.frx = frx     
        self.fry = fry     
        self.nx = nx     
        self.ny = ny     
        self.ddx = ddx     
        self.ddy = ddy     
        self.r = r     

SymConst.py

#coding:windows-1250

### Constant
M_H0 = 3e4
MI_0 = 12.56e-7     
MI_P = 2000
eta = 0.1           
k0 = 10001
dt = 1e-6           # time step      
H0 = 5.95e-4
L0 = 5.95e-4
nrH = 4
nrL = 4
Constant = 100
nParticle = nrH*nrL    # number of particle
Deduplicator
  • 44,692
  • 7
  • 66
  • 118
Kamilos
  • 167
  • 1
  • 2
  • 10
  • 4
    If you're doing heavy numerical work, you basically have to use `numpy`. – senshin Jan 22 '14 at 23:35
  • 4
    What @senshin said. Python is extremely slow compared to a language such as C, especially if you have lot of nested loops or function calls. Writing numpy code will give C-like performance or better, but requires some fundamental adaptations to the way you think about numerical calculations (you operate on entire arrays, not individual elements). Learning numpy is not a 5-minute task, but it's definitely worthwhile. [Here's the official tutorial](http://wiki.scipy.org/Tentative_NumPy_Tutorial) – loopbackbee Jan 22 '14 at 23:39
  • also try to pay attention to code style for example read [here](http://www.python.org/dev/peps/pep-0008/#function-names) – Foo Bar User Jan 22 '14 at 23:39
  • Thanks for answers, I thought about numpy, but do not have much time, what do you think about the processing part of the calculation from a DLL file written in C + + ? How hard is it ? – Kamilos Jan 22 '14 at 23:47
  • @Kamilos If you already know how to create DLLs and use them in python it may be a good idea, but otherwise I'd go for numpy, or pure C/++. Python/C interfacing is not as easy as it could be. You may want to check [cython](http://cython.org/) and [pypy](http://pypy.org/) too. Pypy can give you significant speed improvements without a single change to your code – loopbackbee Jan 22 '14 at 23:56

1 Answers1

5

TL;DR - Use PyPy!

To investigate this problem, first use the Python cProfile module to profile your code and find out where all of the time is being used (with print statements commented out):

$ python -m cProfile -s time SymMain.py 
         30264977 function calls in 34.912 seconds

   Ordered by: internal time

   ncalls  tottime  percall  cumtime  percall filename:lineno(function)
  7370737   27.547    0.000   30.501    0.000 SymCalc.py:62(ParticleCalculate)
        1    3.626    3.626   34.909   34.909 SymCalc.py:8(__init__)
 11101110    1.715    0.000    1.715    0.000 {math.pow}
   160016    0.502    0.000    0.502    0.000 SymCalc.py:128(ChangeOfPosition)
  4440476    0.498    0.000    0.498    0.000 {method 'append' of 'list' objects}
  4440444    0.454    0.000    0.454    0.000 {math.fabs}
  2250226    0.287    0.000    0.287    0.000 {math.sqrt}
   500154    0.280    0.000    0.280    0.000 {range}
        1    0.001    0.001    0.002    0.002 random.py:40(<module>)
        1    0.001    0.001    0.001    0.001 hashlib.py:55(<module>)
        1    0.001    0.001    0.003    0.003 SymCalc.py:2(<module>)
     1616    0.000    0.000    0.000    0.000 SymCalc.py:151(Print)
        1    0.000    0.000   34.912   34.912 SymMain.py:3(<module>)
       32    0.000    0.000    0.000    0.000 SymParticle.py:4(__init__)
---8<--- snip ---8<---

In your case, most of the time is being spent in the SymCalc.py:62(ParticleCalculate) function. Looking at that function it seems to be largely memory access and calculations. This is a good case for looking at PyPy!

PyPy is a fast, compliant alternative implementation of the Python language (2.7.3 and 3.2.3). It has several advantages and distinct features:

  • Speed: thanks to its Just-in-Time compiler, Python programs often run faster on PyPy. (What is a JIT compiler?)
  • Memory usage: large, memory-hungry Python programs might end up taking less space than they do in CPython.
  • Compatibility: PyPy is highly compatible with existing python code. It supports cffi and can run popular python libraries like twisted and django.
  • Sandboxing: PyPy provides the ability to run untrusted code in a fully secure way.
  • Stackless: PyPy comes by default with support for stackless mode, providing micro-threads for massive concurrency.
  • As well as other features.
$ time pypy SymMain.py 
real    0m1.071s
user    0m1.025s
sys 0m0.042s

1.071s - Much better! Compare that to the original execution time on my system:

$ time python SymMain.py 

real    0m29.766s
user    0m29.646s
sys 0m0.103s
Peter Gibson
  • 19,086
  • 7
  • 60
  • 64
  • Do you happen to have [numba](http://numba.pydata.org/) installed in your python? I would be interested to see the speed results if you `autojit` onto the `ParticulateCalculate` function. – Caleb Hattingh Jan 23 '14 at 00:18
  • @cjrh I'm not very familiar with numba, but when I tried it I get an error `NotImplementedError: Unable to cast from { i64, i8* }* to { i64, i8* }.` – Peter Gibson Jan 23 '14 at 00:30
  • I get the same. I appreciate that you even tried it. I'm looking at it a bit more. – Caleb Hattingh Jan 23 '14 at 00:43
  • Ok, I got it working with `@autojit`. The solution was to also autojit the `Particle` class. Unfortunately, the speed of the numba autojit solution was 3x slower than the _python_ version. This is no doubt because the support for classes in numba is a bit tricky to get working. – Caleb Hattingh Jan 23 '14 at 01:10
  • @cjrh yikes! Makes the numpy "zero effort" solution look even better :) – Peter Gibson Jan 23 '14 at 01:37
  • @Peter Gibson, thanks for cPython test. How can I use Pypy implementation to increase speed, when I created the GUI in wxPython ? – Kamilos Jan 23 '14 at 07:49
  • @Kamilos, I'm not sure if it's compatible, you might be best asking this as a new question – Peter Gibson Jan 23 '14 at 22:10
  • @Peter Gibson, I have already checked, Pypy module does not work with wxPython. So I have to do the calculation separately using Pypy, and visualize data with GUI using wxPython. Thanks for all :D – Kamilos Jan 23 '14 at 23:30