
I have a deterministic dynamic programming algorithm that consists of a recursive function, and my code takes forever (more than 5 hours) to produce an output as I increase the number of data points (x and s in the code below).

I heard there is something called parallel computing, using the multiprocessing module in Python, but I am not sure whether that will work for my problem, and if it does, I am not familiar with it at all.

import time
start_time = time.time()
from openpyxl import load_workbook
import pandas as pd
import numbers

wb=load_workbook(filename="data.xlsx", data_only=True)
ws=wb['Sheet1']

#for 1000 step size
x=806
s=1001
n=24

P=[0 for k in range(n)] 
for k in range(n):
    P[k]=ws.cell(row=k+2, column=2).value

X=[0 for j in range(x)] 
for j in range(x):
    X[j]=ws.cell(row=j+2, column=3).value

S=[0 for i in range(s)]
for i in range(s):
    S[i]=ws.cell(row=i+2, column=4).value

Sin=100
Sout=100

F=[[0 for j in range(x)] for i in range(s)]

#@@@@@@@@@@@@@@@@@@@@@@@@@@@@@@@@@@@@@@@@@@@@@@ n=23 @@@@@@@@@@@@@@@@@@@@@@@@@@@@@@@@@@@@@@@@@@@@@@@@@@@@@
class c:
    def abc1(self):
        self.df_output1 = pd.DataFrame()
        for count, k in enumerate(range(n)):
            for i in range(s):
                for j in range(x):
                    if k==n-1:
                        if (S[i]+X[j])==Sin:
                            F[i][j]=-X[j]*P[k]
                        else:
                            F[i][j]="NA"   
        self.Fbar=list()
        self.Xbar=list()  
        for f in F:
            try:
                FFF=max([x for x in f if isinstance(x, numbers.Number)])
                XXX=X[f.index(max([x for x in f if isinstance(x, numbers.Number)]))]
                self.Fbar.append(FFF)
                self.Xbar.append(XXX)
            except ValueError:
                FFF="NA"
                self.Fbar.append(FFF)
                self.Xbar.append(FFF)
        self.df_output1["n="+str(k).format(k)] = self.Xbar 

#@@@@@@@@@@@@@@@@@@@@@@@@@@@@@@@@@@@@@@@@@@@@@@ 22>=n>=1 @@@@@@@@@@@@@@@@@@@@@@@@@@@@@@@@@@@@@@@@@@@@@@@@@@@@@
    def abc2(self):
        list2=(list(range(n))[::-1][1:n-1])
        self.df_output2 = pd.DataFrame()
        for count, k in enumerate(list2):
            for i in range(s):
                for j in range(x):
                    try:
                        if max(S)>=(S[i]+X[j])>=min(S):
                            FFFFF=S[i]+X[j]
                            F[i][j]=-X[j]*P[k]+dict(zip(S,self.Fbar))[FFFFF]
                        if max(S)<(S[i]+X[j])<min(S):
                            F[i][j]="NA"
                    except TypeError:
                        F[i][j]="NA"         
            self.Fbar=list()
            self.Xbar=list()
            for f in F:
                try:
                    FFF=max([x for x in f if isinstance(x, numbers.Number)])
                    XXX=X[f.index(max([x for x in f if isinstance(x, numbers.Number)]))]
                    self.Fbar.append(FFF)
                    self.Xbar.append(XXX)
                except ValueError:
                    FFF="NA"
                    self.Fbar.append(FFF)
                    self.Xbar.append(FFF)
            self.df_output2["n="+str(k).format(k)] = self.Xbar                

#@@@@@@@@@@@@@@@@@@@@@@@@@@@@@@@@@@@@@@@@@@@@@@ n=0 @@@@@@@@@@@@@@@@@@@@@@@@@@@@@@@@@@@@@@@@@@@@@@@@@@@@@   
    def abc3(self):
        self.df_output3 = pd.DataFrame()
        for count, k in enumerate(range(n)):
            if k==0:
                for i in range(s):
                    for j in range(x):
                        if S[i]==Sin and max(S)>=(S[i]+X[j])>=min(S):
                            FFFFF=(S[i]+X[j])
                            F[i][j]=-X[j]*P[k]+dict(zip(S,self.Fbar))[FFFFF]
                        else:
                            F[i][j]="NA"   
                self.Fbar=list()
                self.Xbar=list()  
                for f in F:
                    try:
                        FFF=max([x for x in f if isinstance(x, numbers.Number)])
                        XXX=X[f.index(max([x for x in f if isinstance(x, numbers.Number)]))]
                        self.Fbar.append(FFF)
                        self.Xbar.append(XXX)
                    except ValueError:
                        FFF="NA"
                        self.Fbar.append(FFF)
                        self.Xbar.append(FFF)
                self.df_output3["n="+str(k).format(k)] = self.Xbar

#@@@@@@@@@@@@@@@@@@@@@@@@@@@@@@@@@@@@@@@@@@@@@@@@@@@@@@@@@@@@@@@@@@@@@@@@   

    def abc4(self):
        writer = pd.ExcelWriter('output.xlsx', engine='xlsxwriter')
        self.df_output1.to_excel(writer, sheet_name='Sheet1', startcol=0, header=True, index=False)
        self.df_output2.to_excel(writer, sheet_name='Sheet1', startcol=1, header=True, index=False)
        self.df_output3.to_excel(writer, sheet_name='Sheet1', startcol=n-1, header=True, index=False)
        writer.save()

#@@@@@@@@@@@@@@@@@@@@@@@@@@@@@@@@@@@@@@@@@@@@@@@@@@@@@@@@@@@@@@@@@@@@@@@@@

    def abc5(self):
        wb=load_workbook(filename="output.xlsx", data_only=True)
        ws=wb['Sheet1']
        X=[[0 for i in range(s)] for k in range(n)]
        Xlist=list()
        Slist=list()
        Plist=list()
        for k in range(n):
            for i in range(s):
                X[k][i]=ws.cell(column=24-k, row=i+2).value

            if k==0:
                Xstar=max([x for x in X[k] if isinstance(x, numbers.Number)])
                Sstar=Sin+Xstar
                Gain=-Xstar*P[k]
                Xlist.append(Xstar)
                Slist.append(Sstar)
                Plist.append(Gain)
            else:
                Xstar=X[k][S.index(Sstar)]
                Sstar=Sstar+Xstar
                Gain=-Xstar*P[k]
                Xlist.append(Xstar)
                Slist.append(Sstar)
                Plist.append(Gain)
        print("Profit:",sum(Plist))
foo=c()              
foo.abc1()
foo.abc2()
foo.abc3()
foo.abc4()
foo.abc5()
print("--- %s seconds ---" % (time.time() - start_time)) 

#@@@@@@@@@@@@@@@@@@@@@@@@@@@@@@@@@@@@@@@@@@@@@@@@@@@@@@@@@@@@@@@@@@@@@@

It would be great if somebody could help me understand how to reduce the processing time using the multiprocessing module, or any other way in Python. TIA :)

  • Step 1 in optimization is to profile your program. What parts are taking the most time? Step 2 is figuring out why that part is taking such a long time. Step 3 is finding a solution. Multiprocessing is not a magical device to speed up a program; you have to restructure your code so it can be executed in parallel, which greatly depends on your problem. It'll also introduce other problems you have to think about (such as data races, etc.). – Ted Klein Bergman Jul 07 '19 at 02:07
  • For example, you seem to create many lists using nested for-loops... Are they all necessary? Can you pre-allocate some and reuse them, instead of creating new ones? – Ted Klein Bergman Jul 07 '19 at 02:10
  • Hey, thanks for your quick reply. Yes, I am creating many lists, and they are all generated from their predecessor lists (for example, the lists generated in function abc1 lead to the lists created in function abc2 and then abc3). I don't think I can just pre-allocate any of them. Thanks. – Saurabh Jul 07 '19 at 02:19
  • The only part that is taking too long is up to function abc3; after that, function abc4 just writes all the lists as dataframes to an Excel file, and abc5 does just a few basic mathematical operations, which take only a few seconds to produce their output. – Saurabh Jul 07 '19 at 02:23
  • If you have working code that you're looking to improve or optimize, your question should be asked on [codereview.se] - that's precisely why the site was created. This site is for questions about code that isn't working. – Ken White Jul 07 '19 at 04:23
  • I'm voting to close this question as off-topic because (as noted) this is not a question that's in scope for Stack Overflow, but belongs instead possibly at the Code Review site – sideshowbarker Jul 07 '19 at 05:11

1 Answer


Q: Is there any way to reduce the processing time?
Well, actually not in a pure-[SERIAL] process-flow, and definitely not just by launching a few multiprocessing processes.

The python multiprocessing module is documented ( and joblib behaves the same way ) as follows:

The multiprocessing package offers both local and remote concurrency, effectively side-stepping the Global Interpreter Lock by using subprocesses instead of threads.

Yet, as with everything in our Universe, this comes at a cost:

Each, indeed each and every one of the multiprocessing-spawned sub-processes is first instantiated ( after an adequate add-on latency, since the O/S has to handle the new process and its new RAM-allocations and add it into the process-scheduling management ) as a ---FULL-COPY--- of the ecosystem present inside the original python process ( the complete python interpreter + all its import-ed modules + all its internal state and data-structures, used or not ), so indeed huge amounts of RAM-allocations take place ( this may soon force the platform's O/S to start RAM-SWAPPING once they cease to fit in-RAM, which has devastating performance effects if one tries to populate yet more sub-processes ).

So far so good: we have paid some ( often reasonably negligible, compared to your 5 hours ) time to spawn each process, but instead of about 100 [ns] RAM access-times, your swapping RAM will spend about 100,000x longer to access any data inside the n-copies of all the sub-processes in a "just"-[CONCURRENT] pool.
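If you want to see that instantiation cost in isolation, a minimal sketch like this one ( purely illustrative, not the poster's code ) times a few do-nothing sub-processes; every second measured here is pure add-on overhead, paid before any useful work can even start:

import time
import multiprocessing as mp

def _noop():
    pass                                         # the child process does no useful work at all

if __name__ == "__main__":
    t0 = time.perf_counter()
    procs = [ mp.Process( target=_noop ) for _ in range( 8 ) ]
    for p in procs:
        p.start()                                # spawn: a new interpreter + a copied ecosystem, each
    for p in procs:
        p.join()
    print( "8 empty sub-processes cost %.3f [s]" % ( time.perf_counter() - t0 ) )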

Your processing, as you confirmed, is pure-[SERIAL], one step after another, so there is zero chance to get things done "faster", even if you use the best concurrency tools and make no concurrency-adoption strategy error.

So, as a result, you pay the sort-of-"acceptable" costs of many sub-process instantiations, plus you pay the extreme costs of devastated performance due to a factor of +10k x slower access-times to the many-times-replicated same data ( which you still access in a one-after-another fashion ), and having paid all this, there is no benefit from using all the multiprocessing tools for a still pure-[SERIAL] process-flow.

That does not make any sense.

You simply pay way more than you receive in exchange for doing that.

There are more details on the economy of these add-on costs in an Amdahl's-law re-formulation (here).

In case you can propose an efficient algorithm re-factoring, there may arise chances to benefit from some newly possible concurrency, introduced by a more efficient process re-arrangement, but not in a pure-[SERIAL] sequence of steps ( one-after-another-after-another-... ); a hedged sketch of what such a re-arrangement could look like follows.
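For illustration only — the grids, values, and names below are assumptions, not the poster's data — such a re-arrangement could hand the mutually independent state-rows of a single DP-stage to a multiprocessing.Pool. Note that every task still drags its own copies of X and of the value table across the process boundary, which is exactly the add-on cost discussed above, so this only pays off if the per-row work is heavy enough:

from multiprocessing import Pool

def eval_row( args ):                            # one independent task per state-row
    s_i, X, p_k, fbar_by_state, s_min, s_max = args
    row = []
    for x_j in X:
        nxt = s_i + x_j
        if s_min <= nxt <= s_max and nxt in fbar_by_state:
            row.append( -x_j * p_k + fbar_by_state[nxt] )
        else:
            row.append( float( "nan" ) )         # instead of the "NA" strings
    return row

if __name__ == "__main__":
    S     = list( range( 1001 ) )                # illustrative state grid
    X     = list( range( -400, 406 ) )           # illustrative decision grid
    fbar  = { s_i: 0.0 for s_i in S }            # dummy stage-(k+1) values
    tasks = [ ( s_i, X, 1.0, fbar, S[0], S[-1] ) for s_i in S ]
    with Pool() as pool:
        F = pool.map( eval_row, tasks )          # rows computed concurrently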

Finally, 5 hours is no problem. There are numerical problems that take days, if not weeks, to get processed, even after being efficiently parallelised. A sequentially ordered series of list-iterator transformations could still benefit from cython-compilation or numba-LLVM-compilation ( if the lists are re-factored so as to fit within the numba.jit()-compiler's capability restrictions ).
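As a sketch of the numba route — assuming the lists were first refactored into numpy arrays on a unit-spaced state grid, which is an assumption, not something taken from the poster's code — the innermost loops then compile down to machine code:

import numpy as np
from numba import njit

@njit
def stage_values( S, X, p_k, fbar ):
    # fbar[i] holds the stage-(k+1) value for state S[i]; states are assumed
    # to be consecutive integers, so a successor index is just an offset
    F = np.full( ( S.size, X.size ), np.nan )
    for i in range( S.size ):
        for j in range( X.size ):
            nxt = S[i] + X[j]
            if S[0] <= nxt <= S[-1]:
                F[i, j] = -X[j] * p_k + fbar[int( nxt - S[0] )]
    return F

# e.g. F = stage_values( np.arange( 1001 ), np.arange( -400, 406 ), 1.0, np.zeros( 1001 ) )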

Q: Is there any way to reduce the processing time in any other way in python?

Oh yes, there is. Analyze the code-execution flow: a profiler or debugger ( here, after sharpening the focus onto the .abc3() Class-method, yet the same rule applies to all the other, similarly coded methods above; even a naked eye can do it ) can tell you where you spend most of the code-execution time and where a re-factoring may help most in improving the throughput of the computation.
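A minimal way to get those numbers with just the standard library — dp_script.py is an illustrative file name, and the in-process variant assumes the poster's class c is already defined, as in the code above:

#     python -m cProfile -s cumtime dp_script.py   # whole-script, from the shell

import cProfile

foo = c()                                        # the poster's class, from the question
foo.abc1()                                       # abc2() consumes the Fbar built by abc1()
cProfile.run( "foo.abc2()", sort="cumtime" )     # heaviest callees appear first in the report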

The original code must be noted to be principally flawed: it performs void or even duplicate operations in a row, repeats the same rather expensive block of operations in several places, and is awfully poor and inefficient inside a triple-nested-loop Hell with about 806,000,000 LOOP-REPETITIONS ( python for-loops are known for being the slowest kind of looping ).

############################################################################
# RE-FACTORING REMARKS - REF. APPROXIMATE SCALES OF MAGNITUDES ( RAM ALLOCS + CPU PROCESSING ):
#                                               OF WASTED TIME & RESOURCES
# x =  806
# s = 1001
# n =   24
# P ~ [0:23]
# X ~ [0:806]
# S ~ [0:1000]
# F ~ [0:1000,0:805]
def abc3_faster( self ):
    """                                      __doc__ [DOC-ME]..."""
    self.df_output3 = pd.DataFrame()
    #--------------------------------------- 1x -------- k == 0
    max__S = max( S )           #----------- 1x static max( S[:1000] ) !806k times re-eval'd
    min__S = min( S )           #----------- 1x static min( S[:1000] ) !806k times re-eval'd
    all_NA = [ "NA" for j in range( x ) ] #- 1x static [:806]          !many times re-eval'd
    DictZIP= dict( zip( S, self.Fbar ) ) #-- 1x static [:1000]         ! 19M times re-eval'd
    ####################################################################################################
    # STATIC                                                                                           
    # CASE: k==0 --------------------------------------------------------------------------------<-*-<-*
    for        i in range( s ): #---------------LOOP 1001x                                         ^   ^
        pass;F[i] = all_NA.copy() #---------pre-fill a COPY ( a shared list would alias rows )     ^   ^
        if ( S[i] == Sin ): #-------------------fill-if------------item conditions met             ^   ^
            for j in range( x ): #--------------LOOP 1001x806x                                     ^   ^
                FFFFF   = ( S[i] + X[j] ) #-------- PRE-COMPUTE / REUSE ..................         ^   ^
                if ( #                S[i] == Sin ~ PRE-TESTED ABOVE                      :        ^   ^
                   #nd max( S )  >= ( S[i] + X[j] ) >= min( S )                           :        ^   ^
                       max__S    >= ( FFFFF       ) >= min__S # MOST OF THE TIME IT WILL..:        ^   ^
                       ):                                                                 :        ^   ^
                    #FFFF   = ( S[i] + X[j] )     # PRE-COMPUTED ABOVE                    :        ^   ^
                    #[i][j] = (       -X[j] * P[k]# ENFORCED BY if k == 0:----------------:--------!   ^
                    F[i][j] = (       -X[j] * P[0]#    thus        k == 0 <---------------:--------!   ^
                              # dict( zip( S, self.Fbar )                                 :        v   ^
                              #       )[FFFFF]                                            :        v   ^
                              + DictZIP[FFFFF]    # <--------------------------------------        v   ^
                                )                 #                                                v   ^
                #-------------------------pre-fill-ed already above                                v   ^
                #lse:                     ### before [j]-LOOP-ing                                  v   ^
                #   F[i][j] = "NA"                                                                 v   ^
    #--------------------------------------------------------------------------------------------<-v   ^
    #or count, k in enumerate( range(    n ) ): #----- LOOP 24x ----- NEVER USED VALUES------------v---^
    #   if k==0:                                #----- LOOP 23x NOP-- k==0 DONE ABOVE <----------<-*---^
    #      ....THIS_BODY_OF_THE_ONLY_MEANINGFUL_WORK_REFACTORED........................................*
    #-----------------------------------------------------------                                       v
    self.Fbar = list()                # blank-ed by empty list()                                       v
    self.Xbar = list()                # blank-ed by empty list()                                       v
    #-------------------------------------------------------------------------------------#            v
    for f in F: #-------------------------------LOOP-1001x                                #            v
        try:                                                                              #            v
            FFF =            max( [ x for x in f if isinstance( x, numbers.Number ) ] )   #            v
            #XX = X[f.index( max( [ x for x in f if isinstance( x, numbers.Number ) ] ) ) ]            v
            #----------------^^^^^^^^^^^^^^^^^^^^^^^^^^^^^^^^^^^^^^^^^^^^^^^^^^^^^^^^^^----            v
            #                A STRAIGHT DUPLICATE OF THE SAME WORK? WASTES TIME & RESOURCES            v
            ###############################################################################            v
            self.Fbar.append( FFF )                                                       #            v
            #elf.Xbar.append( XXX )                                                       #            v
            self.Xbar.append( X[f.index( FFF )] )                                         #            v
        except ValueError:                                                                #            v
            self.Fbar.append( "NA" )                                                      #            v
            self.Xbar.append( "NA" )                                                      #            v
            #                                                                             #            v
    #elf.df_output3["n="+str(k).format(k)] = self.Xbar # str( 0 ).format( 0 ) # a void operation       v
    self.df_output3["n=0"] = self.Xbar # <-------------------------------------------------------------v

BONUS - A wrap-up notice:

Best avoid using the whole infrastructure of classes and their respective Class-methods if you are just doing a pure-[SERIAL] sequence of their respective calls foo.m1(); foo.m2(); foo.m3(); foo.m4(); foo.m5().

You pay expenses for nothing received in return.

Next, learn to refactor the whole pure-[SERIAL] pipeline processing into numpy vector/matrix operations and the way smarter, built-in numpy.NaN handling. Avoiding a naive list()-based number representation will further increase your achievable performance by more than a factor of +4x, just by smart numpy vectorisations. Last but not least, numpy is safe for acceleration by numba-compilation, while list()-s may cause problems, being rejected by some earlier numba LLVM-preprocessor versions.
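A hedged sketch of one such vectorised DP-stage — the grids and values are illustrative assumptions, not the poster's data; np.nan stands in for the "NA" strings, so the isinstance()-filtering list comprehensions collapse into array reductions:

import numpy as np

S    = np.arange( 1001, dtype=np.float64 )       # illustrative, unit-spaced state grid
X    = np.arange( -400, 406, dtype=np.float64 )  # illustrative decision grid
fbar = np.zeros( S.size )                        # dummy stage-(k+1) values
p_k  = 1.0

NXT   = S[:, None] + X[None, :]                  # every ( state, decision ) successor at once
valid = ( NXT >= S[0] ) & ( NXT <= S[-1] )       # keep in-range successors only
idx   = np.clip( NXT - S[0], 0, S.size - 1 ).astype( np.intp )
F     = np.where( valid, -X[None, :] * p_k + fbar[idx], np.nan )

F_safe = np.where( np.isnan( F ), -np.inf, F )   # -inf marks the former "NA" entries
Fbar   = F_safe.max( axis=1 )                    # best value per state
Xbar   = X[F_safe.argmax( axis=1 )]              # best decision per state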

No other magic is anywhere near to get used.

    Too bad, great answers are always posted to off-topic questions which are going to be deleted. – Munim Munna Jul 07 '19 at 05:20
  • @MunimMunna :) feel free to reward the great answer and to help change the policy by proposing to protect the content from deletion. Moderators are the people who were granted extended responsibilities, including the right to protect Community content that is in some sense valuable. Ask 'em to act :o) and enjoy the day you have started to make a better future actually happen – user3666197 Jul 07 '19 at 05:24