
I made a for loop whose iterations strangely keep getting slower, even though the amount of data manipulated at each iteration remains constant. The code is below, with:

  • X [N*F]: a numpy array of N samples, each with F variables (features);
  • parts [N]: a numpy array containing the participant number of each sample in X;
  • model_filename: the file name template for each participant's model (i.e. there is one model per participant).

My goal is to apply the model of participant p to the data of participant p and to save its output (i.e. N outputs in total).

import numpy as np
from keras.models import load_model

outputs = np.full((X.shape[0],), np.nan)
for curr_part in np.unique(parts):
    print("processing participant {0}".format(curr_part))
    model = load_model(model_filename.format(curr_part))  # I measured the duration of this call (d0)
    idx = (parts == curr_part)
    outputs[idx] = np.squeeze(model.predict(X[idx, :]))   # I measured the duration of this call (d1)

Both d0 and d1 increase at each iteration of the loop (the whole loop body takes about 1.5 seconds at iteration 0 and around 8 seconds at iteration 20). I completely fail to understand why. Also interestingly, if I run the code several times in IPython, the durations keep accumulating as long as I do not restart the kernel (i.e. on the second run, iteration 0 already takes around 8 seconds). Of course I want to run the code several times, so this issue is critical in the long run.
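
For reference, a minimal sketch of how d0 and d1 can be measured per iteration (time.time() is just an illustrative wall-clock timer, not necessarily the measurement code that was actually used):

import time

outputs = np.full((X.shape[0],), np.nan)
for curr_part in np.unique(parts):
    t0 = time.time()
    model = load_model(model_filename.format(curr_part))
    d0 = time.time() - t0                                  # duration of load_model
    idx = (parts == curr_part)
    t1 = time.time()
    outputs[idx] = np.squeeze(model.predict(X[idx, :]))
    d1 = time.time() - t1                                  # duration of predict
    print("participant {0}: d0={1:.2f}s d1={2:.2f}s".format(curr_part, d0, d1))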

I also tried the following code, which takes approximately the same total duration, although here I cannot measure the time of each call:

unik_parts = np.unique(parts)
models = [(p, load_model(model_filename.format(p))) for p in unik_parts]
outputs = [np.squeeze(m.predict(X[parts == p, :])) for p, m in models]

Python version 2.7

The models are Keras models.

Tabs

  • Try profiling it. – Carcigenicate Apr 10 '17 at 12:06
  • So I used the cProfile module to get some stats and I can see that: there are 20 calls to many functions (i.e. my 20 participants), the load_model function is called 20 times with an average of 4s (but I know the call time increases with iterations), and there are lots of calls to op.py and op_def_library.py which take time. But sincerely my knowledge stops there and I have trouble interpreting these results (a minimal cProfile invocation is sketched after these comments). – Tabs Apr 10 '17 at 12:27
  • During the loop, what's predominantly taking up CPU time? – Carcigenicate Apr 10 '17 at 12:29
  • Sorry, I updated the comment above (I previously hit enter and validated the comment by mistake). – Tabs Apr 10 '17 at 12:32
  • I also have quite a few calls (around 250) to the TensorFlow backend, such as get_session, _initialize_variables, etc. – Tabs Apr 10 '17 at 12:34
  • Sorry, I've only ever profiled Java using VisualVM. Hopefully someone with more knowledge about Python profiling sees this. Good luck. – Carcigenicate Apr 10 '17 at 12:34
  • With times like 1.5s this is a complex calculation (keras, tensorflow) on large data. It's hard to tell what changes from iteration to iteration: problem size? accumulated results? And of course it is impossible to replicate your situation and run our own tests to make up for what you don't tell us. – hpaulj Apr 10 '17 at 15:51
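
For anyone who wants to reproduce the profiling mentioned in the comments, a minimal cProfile invocation looks roughly like this (the sort key and the number of printed entries are arbitrary choices for illustration):

import cProfile
import pstats

profiler = cProfile.Profile()
profiler.enable()

# ... run the participant loop here ...

profiler.disable()
pstats.Stats(profiler).sort_stats("cumulative").print_stats(20)  # top 20 entries by cumulative time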

1 Answer


I've seen this quite a few times when preprocessing data. Typically, in my experience, memory usage creeps up after each iteration, with each following iteration slowing down slightly.

I find that the easiest way to solve this is to separate the tasks into different processes and then use an orchestration process to manage the program flow.

When each task is completed, the associated process is killed and its resources are freed up for the next task in the flow. This is most helpful for keeping long-running jobs responsive.

You could structure the process in this way:

Parent Process
     |_ Pickle Input to Child Proc
     |_ Trigger Child Proc
            |_ Collect Input
            |_ Complete Task
            |_ Pickle Output
     |_ Collect Output



Parent Process -> pickle input -> Child Process
      ^                              |
      |                              |
      ----------------pickle output <-
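
A minimal sketch of this parent/child pattern with Python's multiprocessing module (the function name run_participant and the Queue-based hand-off are illustrative choices, not part of the original code; multiprocessing pickles the arguments and results for you):

import numpy as np
from multiprocessing import Process, Queue

def run_participant(curr_part, X_part, model_filename, queue):
    # Child process: load the model, predict, and hand the result back.
    from keras.models import load_model   # import inside the child so the Keras/TF state stays local
    model = load_model(model_filename.format(curr_part))
    queue.put(np.squeeze(model.predict(X_part)))
    # the child exits after this, so all Keras/TensorFlow memory is released

outputs = np.full((X.shape[0],), np.nan)
for curr_part in np.unique(parts):
    idx = (parts == curr_part)
    queue = Queue()
    child = Process(target=run_participant,
                    args=(curr_part, X[idx, :], model_filename, queue))
    child.start()
    outputs[idx] = queue.get()   # blocks until the child has put its result
    child.join()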

One of the things you can do to manage the task flow is to create an id, use it to create an empty file, pass that id to the child process, and have the child delete the file once its work is complete. This is a simple and convenient way for the parent process to know that the child process has finished.
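
A minimal sketch of that id/empty-file handshake (uuid for the id and the ".running" suffix are arbitrary choices for illustration):

import os
import uuid

def start_task(run_dir):
    # Parent side: create an empty marker file and return the task id.
    task_id = str(uuid.uuid4())
    open(os.path.join(run_dir, task_id + ".running"), "w").close()
    return task_id

def finish_task(run_dir, task_id):
    # Child side: remove the marker once the work is done.
    os.remove(os.path.join(run_dir, task_id + ".running"))

def is_done(run_dir, task_id):
    # Parent side: the task is complete once the marker file is gone.
    return not os.path.exists(os.path.join(run_dir, task_id + ".running"))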

meowmeow