generate min combining multiple lists, return lowest uncombined list?

Question

I have a list of lists that looks like this:

aList = [[10564, 15, 1], [10564, 13, 1], [10589, 18, 1], [10637, 39, 1], [10662, 38, 1], [10837, 45, 1], [3, 17, 13], [7, 21, 13], [46, 1, 13]]

I wanted to find the list with the lowest second element, if the third element is 1, so above it is [10564, 13, 1]. I did that with some help (although I don't fully understand key=lambda k:k[1]; what does that mean?):

i = min((x for x in aList if (str(x[2])=="1")), key=lambda k:k[1])

The way I understood to do it myself was:

target = min(x[1] for x in aList if (str(x[2])=="1"))
matches = [x for x in aList if (x[1] == target) and (str(x[2])=="1")]

However, I now want to change this: I want to compare all neighbouring lists, add the second elements of each pair together, find the pair with the minimum sum, and finally return the one list from that pair with the lower second element. All of this should apply only to lists whose third element is 1. How do you do this?

EDIT: sample input:

aList = [[10564, 15, 1], [10564, 13, 1], [10589, 18, 1], [10637, 39, 1], [10662, 38, 1], [10837, 45, 1], [3, 17, 13], [7, 21, 13], [46, 1, 13]]

Sample output stage one:

[10564, 15, 1], [10564, 13, 1]

This is the lowest neighbouring pair, as 15+13 = 28 and no other pair has that low an addition of the second elements.

Final output is the lowest of this pair:

[10564, 13, 1]

`key` is just a function that gets called; its result is what `min` checks for the smallest of. It allows you to customize functions like `min`, `max`, `sorted` — jamylak, Apr 19 '13 at 10:11
sample input/output added. @MartijnPieters That stuff is in there because in the actual code the values I pass in are a mix of strings and ints. I don't hardcode in 1 etc., I just make sure they are all strings or `==` returns false due to the differing types. — Paul, Apr 19 '13 at 10:27
@Paul: Right; you may want to move that filter out of the function I give you in my answer then. — Martijn Pieters, Apr 19 '13 at 10:29
@jamylak so what would `key=lambda k:k[1]` be checking, the first and second element? — Paul, Apr 19 '13 at 10:29
The key function is given the current item as its argument, so `k[1]` returns its second element — jamylak, Apr 19 '13 at 10:30

Martijn Pieters · Accepted Answer · 2013-04-19T11:10:23.627

The key argument tells min what to determine the minimum by.

Without the key argument, min compares the tuples as a whole: they are compared element by element, starting with the first element. With a key, the key function is called for each element in the input sequence and the minimum is determined solely by the return value of that key. lambda k: k[1] returns the second element of the tuple.

Compare the following two outcomes:

>>> example = [(5, 1), (4, 2), (3, 3), (2, 4), (1, 5)]
>>> min(example)
(1, 5)
>>> min(example, key=lambda element: element[1])
(5, 1)

In the first example, no key function is supplied and min() compares each tuple as-is. In the second example, min() only looks at what the key function returns and thus picks a different element as the minimum.

You can go really overboard with that key function:

>>> min(example, key=lambda element: (element[0] / element[1]) + element[1])
(4, 2)

Using str is not really needed, and the whole expression is overly verbose; you can simplify it down to:

i = min((x for x in aList if x[2] == 1), key=lambda k: k[1])

or using operator.itemgetter:

from operator import itemgetter

i = min((x for x in aList if x[2] == 1), key=itemgetter(1))

To compare neighboring pairs, you'd need an itertools helper function:

from itertools import tee, izip

def pairwise(iterable):
    "s -> (s0,s1), (s1,s2), (s2, s3), ..."
    a, b = tee(iterable)
    next(b, None)
    return izip(a, b)
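
To see what pairwise actually yields, here is a small sketch. It assumes Python 3, where the built-in zip is already lazy and stands in for izip; the short input lists are made up for illustration.

```python
from itertools import tee


def pairwise(iterable):
    "s -> (s0,s1), (s1,s2), (s2, s3), ..."
    a, b = tee(iterable)
    next(b, None)  # advance the second copy by one; no error if it is empty
    return zip(a, b)  # Python 3: zip is lazy, like izip in Python 2


pairs = list(pairwise([15, 13, 18, 39]))
print(pairs)  # [(15, 13), (13, 18), (18, 39)]

# A single element (or an empty input) produces no pairs at all.
single = list(pairwise([15]))
print(single)  # []
```

Because zip stops as soon as the shorter iterator runs out, the last element is never paired with a missing neighbour.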

It's easier to then move the 'last element is 1' criterion to a filter, using itertools.ifilter:

from itertools import ifilter

last_is_one = ifilter(lambda x: x[2] == 1, aList)
paired = pairwise(last_is_one)

Now we can do the real work: from all pairs of neighbouring lists, find the pair whose second elements sum to the lowest value, then from that pair pick the list with the lowest second element:

# find minimum pair by second elements summed
minpair = min(paired, key=lambda pair: pair[0][1] + pair[1][1])
minimum = min(minpair, key=itemgetter(1))

To put that all together, with the responsibility of filtering left to the caller of the function:

from operator import itemgetter
from itertools import tee, izip

def pairwise(iterable):
    "s -> (s0,s1), (s1,s2), (s2, s3), ..."
    a, b = tee(iterable)
    next(b, None)
    return izip(a, b)

def neighbouring_minimum(iterable):
    paired = pairwise(iterable)

    # find minimum pair by second elements summed
    minpair = min(paired, key=lambda pair: pair[0][1] + pair[1][1])
    return min(minpair, key=itemgetter(1))

For your sample input that gives:

>>> from itertools import ifilter
>>> aList = [[10564, 15, 1], [10564, 13, 1], [10589, 18, 1], [10637, 39, 1], [10662, 38, 1], [10837, 45, 1], [3, 17, 13], [7, 21, 13], [46, 1, 13]]
>>> filtered = ifilter(lambda x: x[2] == 1, aList)
>>> neighbouring_minimum(filtered)
[10564, 13, 1]
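
The code above targets Python 2. In Python 3, izip and ifilter no longer exist in itertools, because the built-in zip and filter are lazy themselves. A minimal sketch of the same approach, assuming Python 3 and using a generator expression for the filter:

```python
from itertools import tee
from operator import itemgetter


def pairwise(iterable):
    "s -> (s0,s1), (s1,s2), (s2, s3), ..."
    a, b = tee(iterable)
    next(b, None)
    return zip(a, b)  # lazy in Python 3


def neighbouring_minimum(iterable):
    # find minimum pair by second elements summed
    minpair = min(pairwise(iterable), key=lambda pair: pair[0][1] + pair[1][1])
    # then the member of that pair with the lower second element
    return min(minpair, key=itemgetter(1))


aList = [[10564, 15, 1], [10564, 13, 1], [10589, 18, 1], [10637, 39, 1],
         [10662, 38, 1], [10837, 45, 1], [3, 17, 13], [7, 21, 13], [46, 1, 13]]

# filtering stays with the caller, as in the function above
filtered = (x for x in aList if x[2] == 1)
result = neighbouring_minimum(filtered)
print(result)  # [10564, 13, 1]
```

Note that min() raises ValueError when fewer than two lists pass the filter, since there are then no pairs to compare.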

You can even move the criteria for the minimum to a separate key argument:

def neighbouring_minimum(iterable, key=None):
    if key is None:
        # default to the element itself
        key = lambda x: x

    paired = pairwise(iterable)

    # find minimum pair by key summed
    minpair = min(paired, key=lambda pair: sum(map(key, pair)))
    return min(minpair, key=key)

neighbouring_minimum(ifilter(lambda x: x[2] == 1, aList), key=itemgetter(1))

thanks for the answer, very thorough, gonna run through it with docs to understand. — Paul, Apr 19 '13 at 10:51
Going through your answer now, as I understand the other one. Why have you used ifilter instead of filter, don't they both return only if the condition is true? — Paul, Apr 19 '13 at 15:26
@Paul: `ifilter` returns elements on demand, `filter()` creates a full copy, a new list with the filtered elements. If your input list is large, `ifilter` is going to be more efficient for this task. — Martijn Pieters, Apr 19 '13 at 15:29
new to python sorry, First you filter the list to an iterable with the condition ==1. This is passed to a function and tee splits the iterable into multiple iterables.is this multiple copies of the same iterable? Then you get the next element of b(`none` I dont understand). Is it that you shift b to the right...then call izip and it chains a and b together in pairs? So a s0 is matched with b s1 and so on, making all the pairs I want? Then you get the min of all these pairs second elements summed and then finally get the min of this pair couple, passing in the second element. This correct? — Paul, Apr 19 '13 at 16:06
if next moves b to the next element, what happens when the last element of a is attempted to be chained to something, there would be no b value? And it is ignored I take it? Is this why izip is used vs zip? Sorry all these functions are completely new to me. — Paul, Apr 19 '13 at 16:13
@Paul: The `ifilter()` return value is an iterable too. `tee()` indeed returns multiple copies of an iterable (with an internal buffer to keep things efficient). `next(b, None)` advances `b` by one element, and returns `None` instead of raising `StopIteration` if `b` is empty. The rest is spot on. :-) — Martijn Pieters, Apr 19 '13 at 16:16
@Paul: `izip()` will only return complete pairs, so when `a` has one element left and `b` is empty, that pair is not returned. So for a sequence of length `n`, the last pair is `(s[n-2], s[n-1])`. `izip()` is used because like `ifilter()` it is more efficient; it takes elements one at a time, not creating a whole list in memory first like `zip()` (in Python 2 at least) would do. — Martijn Pieters, Apr 19 '13 at 16:17
Great thanks, I feel like I've learned a lot from your answer. :D Rule of thumb...use i before a function to make it more efficient ;) — Paul, Apr 19 '13 at 16:27
can you throw conditionals into lambda? `minpair = min(paired, key=lambda pair: pair[0][1] + pair[1][1])` I notice sometimes I might get a duplicate x[0]. `[10564, 15, 1], [10564, 13, 1]` If I do I want to ignore the first one, as I'm pairing the same thing to itself. A lowest pair with the same element 0 is invalid. Will try and edit that in. — Paul, Apr 19 '13 at 17:47
Use a conditional expression: `sum(map(key, pair)) if pair[0][0] != pair[1][0] else sys.maxint` for example returns a really large number if the first values in the pair are equal, effectively removing the pair from consideration. — Martijn Pieters, Apr 19 '13 at 17:57
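
The conditional-key idea from the last comment could be sketched as follows. This assumes Python 3, where sys.maxint is gone, so float("inf") is swapped in as the "really large number"; pair_cost is a hypothetical helper name, not part of the answer.

```python
from itertools import tee
from operator import itemgetter


def pairwise(iterable):
    "s -> (s0,s1), (s1,s2), (s2, s3), ..."
    a, b = tee(iterable)
    next(b, None)
    return zip(a, b)


def pair_cost(pair, key=itemgetter(1)):
    """Summed key of the pair, or infinity when both first elements match."""
    first, second = pair
    if first[0] == second[0]:
        return float("inf")  # invalid pair: pushed out of consideration
    return key(first) + key(second)


aList = [[10564, 15, 1], [10564, 13, 1], [10589, 18, 1], [10637, 39, 1],
         [10662, 38, 1], [10837, 45, 1], [3, 17, 13], [7, 21, 13], [46, 1, 13]]

ones = (x for x in aList if x[2] == 1)
minpair = min(pairwise(ones), key=pair_cost)
minimum = min(minpair, key=itemgetter(1))
print(minpair)  # ([10564, 13, 1], [10589, 18, 1])
print(minimum)  # [10564, 13, 1]
```

The pair with the duplicate 10564 is skipped, so the winning pair is now the second one (13 + 18 = 31). If every pair were invalid, min() would still return the first pair, so a caller may want to check for that case separately.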

score 3 · Answer 2 · edited May 23 '17 at 10:25

For general understanding of lambda there are several excellent answers here that I couldn't hope to reproduce like this one.

In your specific case: (I've cleaned it up slightly)

i = min((x for x in aList if x[2]==1), key=lambda k:k[1])

you should read it as :

some_generator = (x for x in aList if x[2]==1)  # a generator of list elements
                                                # where the 3rd element == 1
i = min(some_generator, key=lambda k:k[1])      # minimum with respect to the
                                                # 2nd element

lambda in the above code intercepts each 3-element list passed to min() and returns the 2nd value. This tells min() not to minimise by the first element, which it would do by default, but by the second. Hopefully the simple case below makes this clear.

>>> min([3,0],[1,2],[2,1])
[1, 2]
>>> min([3,0],[1,2],[2,1], key=lambda x:x[1])
[3, 0]

Now for your second question, I think this will achieve what you want...

[Edit: I've removed the wrap around functionality. See comments below. This also meant the modulus is redundant so the code is much cleaner!]

# Accept only elements where 3rd value is a 1 
only_ones = [x for x in aList if x[2]==1]

# neighbours will hold pairs of entries for easy lambda minimization
neighbours = []
for i, element in enumerate(only_ones[:-1]) :
   neighbours.append([element, only_ones[i+1]])

# Get minimum pair with respect to the sum of the pair's middle elements first,
# then get minimum of the resulting pair with respect to the middle element
i = min(min(neighbours, key=lambda k: k[0][1] + k[1][1]),
        key=lambda k:k[1])

In hindsight, neighbours is probably better as a generator

neighbours = ([x, only_ones[i+1]] for i, x in enumerate(only_ones[:-1]))
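
As an alternative sketch (not the approach used above), the same neighbour pairs can be built by zipping the filtered list with itself shifted by one. This assumes only_ones is a list, so that slicing works; zip stops at the shorter input, so no index arithmetic or wrap-around is involved.

```python
aList = [[10564, 15, 1], [10564, 13, 1], [10589, 18, 1], [10637, 39, 1],
         [10662, 38, 1], [10837, 45, 1], [3, 17, 13], [7, 21, 13], [46, 1, 13]]

# Accept only elements where 3rd value is a 1
only_ones = [x for x in aList if x[2] == 1]

# Pair each element with its right-hand neighbour
neighbours = zip(only_ones, only_ones[1:])

# Minimum pair by summed middle elements, then minimum within that pair
minpair = min(neighbours, key=lambda pair: pair[0][1] + pair[1][1])
i = min(minpair, key=lambda k: k[1])
print(i)  # [10564, 13, 1]
```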

And finally, for fans of excessive, unreadable list/generator comprehension

i = min(min(([x, only_ones[i+1]] 
              for i, x in enumerate([x for x in aList if x[2]==1][:-1])) ,
                 key=lambda k: k[0][1] + k[1][1]),
                    key=lambda k: k[1])

(sorry I couldn't resist!)

Thanks for that link, very nice. Will give your answer a go too, what does that wraps around line mean? I have to go read about enumerate also, thanks! — Paul, Apr 19 '13 at 10:52
You do a lot more work than is needed here, creating 2 extra lists (`only_ones` and `neighbours`) where generators would only produce elements as needed. — Martijn Pieters, Apr 19 '13 at 11:03
also minor correction with brackets, `neighbours.append([element, only_ones[(i+1) % len(only_ones)]])` — Paul, Apr 19 '13 at 11:05
@Paul: Wrapping around here means the last element is paired with the first element, so the last pair is `(only_ones[-1], only_ones[0])`. You'll need to make sure that that is what you want. For your one example input, that extra pair makes no difference (`15 + 45` is not going to beat `15 + 13`) but for other inputs that may make a big difference. — Martijn Pieters, Apr 19 '13 at 11:11
Ah I see, I don't want it to wrap around! I should have said that, that would be very bad for my answer. — Paul, Apr 19 '13 at 11:14
@MartijnPieters Thanks for fielding these questions. @Paul I added the wrap around so that the for loop wouldn't break while iterating through the entire list. As you don't want to do this, you can remove the functionality by either a) iterating only up to, but not including, the last element with `for ... in enumerate(only_ones[:-1])` or b) ignoring the last neighbours element with `i = min(min(neighbours[:-1],...)`. a) is probably technically best as it's one less loop iteration to execute! — ejrb, Apr 19 '13 at 12:43
