-1

I built a software application as a project. It is complete and works correctly, but it runs very slowly. I have already taken several chunks of the code and optimized them.

I tried Psyco: I installed it and added two lines at the top of my code:

import psyco
psyco.full()

I don't know whether this is the right way to use Psyco. If it is wrong, please tell me how to use it, because I saw no improvement after adding these lines.

I have profiled the code and know which lines take the most time, but they are unavoidable and I can't see how to optimize them further.

I also thought of rewriting the code in C using some Python package, but I have always had bad experiences with Python packages that are not part of the standard distribution.

I am using Python 2.6 on Windows Vista. Please suggest methods for increasing the execution speed of the whole program significantly, by at least 5x.

I haven't organized my code into functions, although there are a few functions in between. There is no main.

Yes, as a few people suggested, mine is an I/O-bound problem: I need to call the code some 500 times, and each call opens and closes at least 2 files.

When I open a .pm file, it has two columns and I need only the first one. So I copy the entire first column into a list, pass it to a function to get a row number, and then open another file to read the elements of that row into a list.

This is the task I need to perform. I guess loading the elements of the first column into the list is what takes the time. Any ideas on how to fix this?

How can I improve performance for I/O-bound bottlenecks?

Looking for help desperately

marc_s
kaki
  • You mention that you know which lines are performing poorly. Perhaps you could post those particular lines? I suspect there is no magic bullet that automatically improves performance for every program by five times. – Slartibartfast Jun 22 '10 at 05:10
  • You should post some code (the performance bottleneck, for example). –  Jun 22 '10 at 05:11
  • Ditto everything above. Also, please reformat the question so it's easier to read and understand. I would if I could but I can't. – Sam Dolan Jun 22 '10 at 05:14
  • This is one part of the code. There is one more part which is also taking a lot of time. Can I post some 200 lines of code? – kaki Jun 22 '10 at 05:24

6 Answers

3

You could get a lot better performance if you could switch to binary file formats. Most of your code is doing parsing and string manipulation. You're doing a lot of converting strings to floats, which is slower than you think.
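As a rough sketch of what switching to a binary format could look like, the standard-library `array` module can store floats as raw doubles. The function names and the assumption that the text file holds only whitespace-separated numbers are illustrative, since the actual file layout is not shown in the question.

```python
import os
from array import array


def text_to_binary(text_path, bin_path):
    """One-time conversion: parse every number in text_path and
    store it as raw 8-byte doubles in bin_path."""
    values = array('d')
    with open(text_path) as src:
        for line in src:
            # The slow string-to-float parsing happens only here.
            values.extend(float(token) for token in line.split())
    with open(bin_path, 'wb') as dst:
        values.tofile(dst)
    return len(values)


def load_binary(bin_path):
    """Load all doubles back without any string parsing."""
    count = os.path.getsize(bin_path) // array('d').itemsize
    values = array('d')
    with open(bin_path, 'rb') as src:
        values.fromfile(src, count)
    return values
```

The conversion is paid once; every later run calls only `load_binary`. Note that `tofile` writes machine-native bytes, so the binary file is meant to be read on the same kind of machine that wrote it.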

dmazzoni
  • Please post some code to help me understand. I get what you mean, but as a newbie I am not able to write the code for it. I also tried your suggestion above, but it didn't make much difference, because I don't know how to make the min function do two comparisons in a single scan of the list. I made the function return two lists in a single call, but inside it I just repeated the same code twice, so nothing improved. Kindly give some example code. – kaki Jun 22 '10 at 06:44
1

You are unlikely to see a 5x performance difference just by tweaking the code.

First you should look at improving your algorithm: are you using the best data structures for the job? Perhaps using a dict or a set in the right place can speed your code up a lot.
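For instance, if the row number is currently found by searching a list of first-column values, a dict built once can answer each lookup without scanning. This is only a sketch: `build_row_index` is a hypothetical helper, and it assumes the lookup is an exact match on the column value.

```python
def build_row_index(first_column):
    """Map each value to the row number of its first occurrence."""
    index = {}
    for row, value in enumerate(first_column):
        # setdefault keeps the earliest row when a value repeats.
        index.setdefault(value, row)
    return index


column = ["a", "b", "a", "c"]      # stand-in for the real first column
row_of = build_row_index(column)   # built once, O(n)
row = row_of.get("c")              # each lookup is O(1) on average
```

With a plain list, `column.index(value)` rescans the list on every call, which adds up quickly over hundreds of calls.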

Writing a C module is not all that hard, and is another option if you can find no way to improve the Python code. Usually you would expect more than a 5x speed up by using C code.

Maybe your problem is I/O bound. Then you need to look at ways to improve the performance of the I/O.

If you want more help here, you'll probably have to show some code or at least describe what your program does.

UPDATE: It looks like you are opening and closing lots of files, which tends to be painfully slow on Windows.
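One way to cut down on repeated opens is to keep each file's parsed contents in memory after the first read. The sketch below assumes the files do not change between the 500 calls and that columns are separated by whitespace; `first_column` and the module-level cache are hypothetical names, not part of the original code.

```python
_column_cache = {}  # path -> list of first-column strings


def first_column(path):
    """Return the first column of a whitespace-separated file,
    opening and parsing the file at most once per path."""
    if path not in _column_cache:
        with open(path) as handle:
            _column_cache[path] = [
                line.split(None, 1)[0]   # keep only the first field
                for line in handle
                if line.strip()          # skip blank lines
            ]
    return _column_cache[path]
```

Every call after the first for a given path returns the cached list without touching the disk.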

John La Rooy
0

Psyco can be used simply by importing it and calling psyco.full(), so your usage is correct.

If you are trying to build a Python module using C/C++, have a look at boost::python
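A slightly more defensive version of the same two lines is sketched below. As far as I know, Psyco compiles functions rather than module-level statements, so code that lives outside any function may see little benefit; moving it into a `main()` function (a hypothetical name here) is a common arrangement.

```python
def enable_psyco():
    """Turn Psyco on if it is installed; return True when enabled."""
    try:
        import psyco
    except ImportError:
        return False   # Psyco is not available on this interpreter
    psyco.full()       # same call as in the question
    return True


def main():
    # Hypothetical entry point: the statements that currently sit at
    # module level would move in here so Psyco can compile them.
    pass


if __name__ == "__main__":
    enable_psyco()
    main()
```

Even when enabled, Psyco speeds up CPU-bound Python code; it is not expected to help with time spent opening and reading files.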

You should really post your code for further analysis.

Jason
0

To optimize your code for speed, you simply have to profile it and see where the problem is. Guessing does not help. Once you know where the time goes, the most bang for your buck usually comes from the following, in descending order: improving the algorithm, using more appropriate data structures, removing resource bottlenecks (I/O, memory, CPU), reducing memory allocation, and reducing context switching (processes and subroutines).
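A minimal way to do that profiling with the standard library is sketched here. `profile_call` and `slow_sum` are made-up names for illustration; the real program would pass its own function instead.

```python
import cProfile
import pstats


def profile_call(func, *args, **kwargs):
    """Run func under cProfile, print the 10 most expensive
    entries by cumulative time, and return func's result."""
    profiler = cProfile.Profile()
    result = profiler.runcall(func, *args, **kwargs)
    stats = pstats.Stats(profiler)
    stats.sort_stats('cumulative').print_stats(10)
    return result


def slow_sum(n):
    # Stand-in workload for the code being investigated.
    return sum(i * i for i in range(n))
```

Calling `profile_call(slow_sum, 100000)` prints a table in which the `cumtime` column shows which functions account for most of the run time, including time spent in the functions they call.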

Jiri Klouda
  • As for your specific example, there might be two angles. I'd look at the way you store your data in files and have to parse them. This could very well be an I/O problem. Another angle is the split and line.partition methods you use. Those are surprisingly slow. – Jiri Klouda Jun 22 '10 at 05:37
  • Can you tell me any alternatives to split and partition that would suit my task, please? I am a newbie and not aware of all the functions. – kaki Jun 22 '10 at 06:00
  • I'm sorry, I don't really know Python syntax or libraries - dreadful language - but in Perl I got serious performance gains by using the substr call instead. But I would go with what dmazzoni said: switch to binary formats and avoid all this string manipulation altogether. – Jiri Klouda Jun 22 '10 at 23:55
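Whether split or partition is really the slow part can be measured rather than guessed. The sketch below times three ways of taking the first column from a line with `timeit`; the sample line and function names are hypothetical, and the partition variant assumes the columns are separated by a single space.

```python
import timeit

SAMPLE_LINE = "12.5 0.75\n"  # hypothetical two-column line


def first_by_split(line):
    return line.split()[0]            # splits every field


def first_by_limited_split(line):
    return line.split(None, 1)[0]     # stops after the first field


def first_by_partition(line):
    return line.partition(" ")[0]     # cuts at the first space


def time_variants(number=100000):
    """Return the seconds each variant needs for `number` calls."""
    variants = (first_by_split, first_by_limited_split, first_by_partition)
    return dict(
        (func.__name__,
         timeit.timeit(lambda f=func: f(SAMPLE_LINE), number=number))
        for func in variants
    )
```

The timings depend on the machine and on how long the real lines are, so it is worth running `time_variants()` against a line taken from the actual data.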
0

Here's one opportunity for optimization: you're calling get_list twice, with very similar arguments:

join_cost_index_end[index] = get_list(file, float(abs1), fout)
join_cost_index_strt[index] = get_list(file, float(abs2), fout)

That means that most of the work in get_list is being done twice for no good reason. Rewrite it so that get_list is being called once, and have it return both index_end and index_strt at the same time.
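The body of get_list is not shown here, so the following is only a sketch of the idea under an assumption: that the work consists of finding, for each target, the index of the closest value in a list. `closest_indexes` is a hypothetical replacement that handles both targets in a single scan.

```python
def closest_indexes(values, target_a, target_b):
    """Return (index_a, index_b): the positions of the values
    closest to target_a and target_b, found in one pass."""
    best_a = best_b = None
    dist_a = dist_b = float('inf')
    for i, value in enumerate(values):
        d = abs(value - target_a)
        if d < dist_a:            # new closest value for target_a
            dist_a, best_a = d, i
        d = abs(value - target_b)
        if d < dist_b:            # new closest value for target_b
            dist_b, best_b = d, i
    return best_a, best_b
```

The two assignments above would then collapse into one call, along the lines of `end, strt = closest_indexes(column, float(abs1), float(abs2))`, with the file read only once to produce `column`.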

dmazzoni
  • That will not make any difference until I change the get_list function to scan through the list only once. But I guess the min() I used can't be manipulated to make two comparisons in a single scan of the file and return the min value for two variables. If it is possible, please post some code so that I can understand. – kaki Jun 22 '10 at 06:41
0

Why not just try using Cython? You should get much better performance without changing any of the code. With a little bit of modification this should help even more.

xyz-123
  • Yes, this is what I wished for. Can you please be more specific about how to use and implement this and what the little modifications are, instead of answering so vaguely? If you have used it, please tell me whether anything needs to be installed and how to use it. Thank you. – kaki Jun 22 '10 at 07:03
  • I'm not here to do your work. I gave you a hint; now search for Cython and read the documentation. Cython is not that difficult to understand and use... – xyz-123 Jun 22 '10 at 11:40
  • I didn't ask you to do my work, I just asked for a suggestion. How can you say "with a little modification" and leave it at that, without mentioning what the modifications are, if you know them? If you are not here to help others, why be here at all? Don't give such an answer just to increase your reputation on SO. You named just one package out of the thousands available without giving any information about it. If everyone suggested one package as an answer, where would that leave us? I am not here to get into a war of words; we are here to help each other. So, "peace". Raise your rep score before giving an answer such as "I am not here to work for you". – kaki Jun 23 '10 at 06:11