Split tests into even time slaves

Question

I have a full list of how long each test takes.

They are python behave test features.

Slaves are created via Jenkins.

I have the tests split across x slaves. The slaves run the tests and report back.

Problem: some slaves get bigger, longer-running tests than others. For example, one slave takes 40 minutes and another takes 5 minutes.

I want to average this out.

I currently have a list of each file and the time it takes.

[
    ['file_A', 501],
    ['file_B', 350],
    ['file_C', 220],
    ['file_D', 100]
]

And so on; there are n files.

At the moment these are split into lists by number of files. I would like to split them by total time taken instead. For example, 3 slaves running these 4 tests would look like this:

[
    [
        ['file_A', 501]
    ],
    [
        ['file_B', 350]
    ],
    [
        ['file_C', 220],
        ['file_D', 100]
    ]
]

Something like that...

Please help

Thanks!

You've not specified how many test slaves you want, and you get different answers depending on the number of test slaves. — Jonathan Leffler, Jul 21 '13 at 10:37

mr2ert · Accepted Answer · 2013-07-21T10:12:18.293

3

You could do something like:

def split_tasks(lst, n):
    # sorts the list from largest to smallest
    sortedlst = sorted(lst, key=lambda x: x[1], reverse=True)
    # dict storing the total time for each set of tasks
    totals = dict((x, 0) for x in range(n))
    outlst = [[] for x in range(n)]
    for v in sortedlst:
        # since each v[1] is getting smaller, the place it belongs should
        # be the outlst with the minimum total time
        m = min(totals, key=totals.get)
        totals[m] += v[1]
        outlst[m].append(v)
    return outlst

For the example list with n = 3, this produces the expected output:

[[['file_A', 501]], [['file_B', 350]], [['file_C', 220], ['file_D', 100]]]
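
To check how even the split is, the per-slave totals can be summed from the output above. This is only a minimal sketch; the names `groups` and `totals` are illustrative and do not come from the answer.

```python
# Output of split_tasks for the example list with n = 3 (copied from above).
groups = [[['file_A', 501]], [['file_B', 350]], [['file_C', 220], ['file_D', 100]]]

# Total run time per slave, in the same units as the input timings.
totals = [sum(duration for _name, duration in group) for group in groups]

print(totals)       # [501, 350, 320]
# The whole run takes as long as the busiest slave.
print(max(totals))  # 501
```

The overall run cannot be shorter than 501 here, because file_A alone takes that long.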

edited Jul 21 '13 at 10:12

answered Jul 21 '13 at 10:06

mr2ert


@downvoter What warranted the down vote, and how could I improve the answer? – mr2ert Jul 21 '13 at 10:31
I don't really see why this is down-voted. You've parameterized the workload with `n`, the number of test slaves, and you generate a list with one entry for each of the `n` slaves, where the entry is itself a list of the tests to be run by the `n`th slave. This seems to be a fine solution — or would be if you explained a bit more of what you were doing. I can only conclude that the down-voter didn't understand what you've done. – Jonathan Leffler Jul 21 '13 at 10:46
This is the same as [see 6855394](http://stackoverflow.com/questions/6855394/splitting-list-in-chunks-of-balanced-weight), except you provided an implementation, so I support your answer – Jakob Kroeker Jul 21 '13 at 12:20
Sorry this took so long. I have stuck in my results and this has worked beautifully! Thanks @mr2ert – Lazerlightning Jul 21 '13 at 14:28

Steve Barnes · Answer 2 · 2013-07-21T09:44:43.820

1

Sort your tests into descending order of the time taken to run. Send one to each slave from the top of the list, then give each slave another test as it finishes. With this strategy, if a test hangs or takes longer than usual, all the other tests will still finish in the minimum time.

If you cannot distribute the tests on completion, allocate a list for each server and "deal" the tests out in the same manner.
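
One possible reading of "dealing" the tests out is a plain round-robin over the sorted list. The sketch below assumes that reading; `deal_tasks` is a hypothetical helper name, not something from the answer.

```python
def deal_tasks(tasks, n):
    """Deal tasks to n lists in descending order of duration, like cards."""
    ordered = sorted(tasks, key=lambda task: task[1], reverse=True)
    hands = [[] for _ in range(n)]
    for i, task in enumerate(ordered):
        # Task i goes to list i mod n, regardless of the running totals.
        hands[i % n].append(task)
    return hands


tasks = [['file_A', 501], ['file_B', 350], ['file_C', 220], ['file_D', 100]]
print(deal_tasks(tasks, 3))
# [[['file_A', 501], ['file_D', 100]], [['file_B', 350]], [['file_C', 220]]]
```

Strict dealing ignores the running totals, so in this example the first list ends up with 601. Handing each test to the currently least-loaded list instead would put file_D with file_C.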

edited Jul 21 '13 at 09:44

answered Jul 21 '13 at 09:37

Steve Barnes


He said that he already knows how long each test will take. – Moayad Mardini Jul 21 '13 at 09:38
I have a full list of how long each test takes. They are python behave test features. Slaves are created via Jenkins. – Lazerlightning Jul 21 '13 at 09:38

I don't really see why this is down-voted. This is a valid outline of a solution to the problem. If you have three test slaves, you run the three longest tests first. When the 220 test finishes, it runs the 100 test. The full set of tests completes in the time it takes the 501 test to finish (it can't be shorter than that, of course). If you have two slaves, the one running the 350 test finishes and runs the 220 test; the 501 test finishes and runs the 100 test; and the total time is 601. If you have 4 or more slaves, then of course the tests are simply all run in parallel. – Jonathan Leffler Jul 21 '13 at 10:51
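
The numbers in the comment above can be reproduced with a small simulation. This sketch assumes every test takes exactly its recorded time and that a free slave always picks the longest remaining test; `simulate_makespan` is a hypothetical name.

```python
import heapq


def simulate_makespan(durations, n):
    """Simulate n slaves that each pull the longest remaining test when free."""
    # Each heap entry is the time at which one slave becomes free.
    free_at = [0] * n
    heapq.heapify(free_at)
    for duration in sorted(durations, reverse=True):
        start = heapq.heappop(free_at)  # the slave that is free earliest
        heapq.heappush(free_at, start + duration)
    # The run ends when the last slave finishes.
    return max(free_at)


durations = [501, 350, 220, 100]
for n in (2, 3, 4):
    print(n, simulate_makespan(durations, n))
# 2 601
# 3 501
# 4 501
```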

score 0 · Answer 3 · edited May 23 '17 at 11:49


It seemed to me that the problem was a Simple Assembly Line Balancing Problem (see page 2), but I guess it is instead multi-way number partitioning. Here is a recent related paper by Michael D. Moffitt.

I don't know if there is a Python module which solves this problem, but maybe someone at Stack Overflow knows?

You could implement an algorithm which approximates the solution:
(quoting the accepted answer at 6855394)

Greedy:
1. Order the available items descending.
2. Create N empty groups.
3. Start adding the items one at a time into the group that has the smallest sum in it.
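
Because the greedy steps above only approximate the best split, it can help to compare the result with a simple lower bound. The sketch below is an illustration under the assumption that durations are integers; `makespan_lower_bound` is a hypothetical name.

```python
import math


def makespan_lower_bound(durations, n):
    """Return a time that no split over n groups can beat."""
    # The longest single item must run somewhere, and the busiest group
    # must carry at least the average load (rounded up for integer durations).
    return max(max(durations), math.ceil(sum(durations) / n))


durations = [501, 350, 220, 100]
print(makespan_lower_bound(durations, 3))  # 501
print(makespan_lower_bound(durations, 2))  # 586
```

For 3 groups the greedy split of the example reaches 501, which equals the bound, so it cannot be improved. For 2 groups the bound is not necessarily attainable; it only limits how far from optimal a split can be.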

edited May 23 '17 at 11:49

Community


answered Jul 21 '13 at 09:55

Jakob Kroeker


Using the [salbp](http://phd.ie.lehigh.edu/~aykut/software/index.html) package (depends on **pulp** and **coin-or**) I managed to get the correct required minimal global time (501), but I don't know yet how to get the corresponding partition... – Jakob Kroeker Jul 21 '13 at 11:33

score 0 · Answer 4 · answered Jul 21 '13 at 10:13


This can be solved using a Knapsack algorithm, using the sum of all values divided by the number of slaves.

answered Jul 21 '13 at 10:13

Moayad Mardini


Dear Moayad Mardini, I'm sorry, could you perhaps explain/improve your post? In my opinion the problem can only be solved with *modified* Knapsack algorithms, but not with unmodified ones. Or how can you express a _balanced multi-way number partitioning_ problem as a Knapsack problem? Thanks – Jakob Kroeker Jul 22 '13 at 19:48
Yes, it's a special case of the 0-1 knapsack problem. – Moayad Mardini Jul 22 '13 at 21:04
