8

Is there a more concise way to split a list into two lists by a predicate?

errors, okays = [], []
for r in results:
    if success_condition(r):
        okays.append(r)
    else:
        errors.append(r)

I understand that this can be turned into an ugly one-liner using reduce; this is not what I'm looking for.

Update: calculating success_condition only once per element is desirable.

9000
  • Your code is fine. All proposed solutions are looping and applying the filtering function twice. – Paolo Moretti Aug 15 '12 at 18:41
  • @PaoloMoretti: I agree with your first sentence, but your second is wrong-- see dbaupp's. – DSM Aug 15 '12 at 18:42
  • @PaoloMoretti, and dfb's second suggestion. – huon Aug 15 '12 at 18:42
  • 3
    possible duplicate of [python equivalent of filter() getting two output lists (i.e. partition of a list)](http://stackoverflow.com/questions/4578590/python-equivalent-of-filter-getting-two-output-lists-i-e-partition-of-a-list) – ephemient Aug 15 '12 at 18:43
  • I also agree with @PaoloMoretti's first statement, this code is fine. There's no advantage of these solutions over yours – dfb Aug 15 '12 at 18:45
  • Many of the alternatives here involve calling `success_condition` twice for each result. This can be a bad idea if it is expensive to execute or has side effects – John La Rooy Aug 15 '12 at 19:27
  • @gnibbler, that was already discussed about 3 comments up. – huon Aug 15 '12 at 19:51
  • @dbaupp, just making it clear _why_ it's potentially bad to call it twice – John La Rooy Aug 15 '12 at 19:58
  • "I understand that this can be turned into an ugly one-liner using `reduce`" – BTW: this is a tautology: `reduce` is a general method of iteration, *everything* that can be expressed with iteration (thus also everything that can be expressed with `map`, `filter` or comprehensions) can be expressed with `reduce`, see [the Wikipedia page for `fold`](http://Wikipedia.Org/wiki/Fold_(higher-order_function)#Universality) for a proof. – Jörg W Mittag Aug 15 '12 at 20:30
  • @JörgWMittag: I do know that `list` and `fold()` are different ways to say the same thing, and any catamorphism can be expressed using fold. What I'm interested in is a more compact but readable way to express the 6-liner in Python, maybe using some built-in / standard function I failed to think about. – 9000 Aug 15 '12 at 21:08
  • Usually, that operation is called `partition` for the special case of partitioning a collection into two collections based on a boolean predicate and `group_by` for the more general case of grouping by an arbitrary key. – Jörg W Mittag Aug 15 '12 at 22:14
  • possible duplicate of [Python: split a list based on a condition?](http://stackoverflow.com/questions/949098/python-split-a-list-based-on-a-condition) – user Sep 21 '14 at 03:02
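
As a rough illustration of the `partition` / `group_by` distinction mentioned in the comments, here is a minimal sketch. The `group_by` helper and the sample data are hypothetical names invented for this example, not a standard-library function; the two-way split is simply the special case where the key is a boolean.

```python
from collections import defaultdict


def group_by(key, iterable):
    """Group items by key(item); key is called once per item (hypothetical helper)."""
    groups = defaultdict(list)
    for item in iterable:
        groups[key(item)].append(item)
    return groups


# Sample data standing in for `results`; the lambda stands in for `success_condition`.
results = [1, -2, 3, -4]
groups = group_by(lambda r: r > 0, results)

# With a boolean key, the groups are exactly the two halves of the partition.
okays, errors = groups[True], groups[False]
```

Because `groups` is a `defaultdict`, asking for a key that never occurred yields an empty list rather than raising `KeyError`.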

6 Answers

6

Maybe

for r in results:
    (okays if success_condition(r) else errors).append(r)

But that doesn't look/feel very Pythonic.


Not directly relevant, but if one is looking for efficiency, caching the method look-ups would be better:

okays_append = okays.append
errors_append = errors.append

for r in results:
    (okays_append if success_condition(r) else errors_append)(r)

Which is even less Pythonic.
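
If the split is needed in more than one place, the conditional-append loop could be wrapped in a small helper so that each call site becomes a single line. This is only a sketch: `partition` is a hypothetical name chosen here, and the sample data and lambda stand in for `results` and `success_condition`.

```python
def partition(predicate, iterable):
    """Return (matching, non_matching), calling predicate once per item."""
    matching, non_matching = [], []
    for item in iterable:
        # Choose the destination list first, then append to it.
        (matching if predicate(item) else non_matching).append(item)
    return matching, non_matching


# Hypothetical sample data.
results = [3, -1, 4, -1, 5, -9]
okays, errors = partition(lambda r: r >= 0, results)
print(okays)   # [3, 4, 5]
print(errors)  # [-1, -1, -9]
```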

huon
4

How about

errors = [ r for r in results if not success_condition(r)]
okays = [ r for r in results if success_condition(r)]

Or

bools = [ success_condition(r) for r in results ] 

and then build the two lists from results and bools together (via zip or enumerate) instead of calling success_condition again, if it is a costly call.
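
A minimal sketch of the cached-flags idea, assuming `zip` is used to pair each element with its flag. The sample data and the body of `success_condition` are made up for illustration.

```python
# Hypothetical stand-ins for the question's names.
results = ["ok:1", "err:2", "ok:3"]


def success_condition(r):
    return r.startswith("ok")


# Evaluate the predicate exactly once per element.
bools = [success_condition(r) for r in results]

# Reuse the cached flags to build both lists without calling the predicate again.
okays = [r for r, ok in zip(results, bools) if ok]
errors = [r for r, ok in zip(results, bools) if not ok]
```

This keeps the predicate to one call per element, at the cost of a third pass over the data and a temporary list of flags.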

dfb
  • But in order to get the good ones, you need to filter again -- effectively checking each element twice (which is potentially expensive). – mgilson Aug 15 '12 at 18:34
  • Yes, this is true, but he asked for concision, not efficiency – dfb Aug 15 '12 at 18:35
  • @dfb I deleted it because it wasn't actually more succinct. – Waleed Khan Aug 15 '12 at 19:10
  • I think by the time you combine `bools` with `zip` it's not going to be very concise anymore – John La Rooy Aug 15 '12 at 19:42
  • @gnibbler - really? it turns into `okays = [ r[0] for r in zip(results,bools) if r[1]]`. Not too much worse... or were you referring to the fact that it's 3 lines instead of 2? – dfb Aug 15 '12 at 20:02
  • @dfb, comparing the 3 lines to the 6 from the OP, in my head it looks like you're building a list of bools and not really gaining anything – John La Rooy Aug 15 '12 at 20:06
4
errors, okays = [], []
for r in results:
    (errors, okays)[success_condition(r)].append(r)
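
One caveat with the tuple-indexing trick above: it relies on `success_condition` returning a real `bool` (or 0/1), because the result is used as an index. If the predicate only returns something truthy or falsy, wrapping it in `bool()` keeps the index valid. A small sketch with made-up data and a made-up predicate:

```python
# Hypothetical sample data: empty strings count as failures.
results = ["a", "", "b", ""]


def success_condition(r):
    # Returns the string itself: truthy or falsy, but not a bool.
    return r


errors, okays = [], []
for r in results:
    # bool() maps any truthy/falsy value to True/False, i.e. index 1 or 0.
    (errors, okays)[bool(success_condition(r))].append(r)
```

Without the `bool()` call, indexing the tuple with a string would raise `TypeError`.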
John La Rooy
1

I find the following construction to be elegant and readable:

successes = [r for r in results if success_condition(r)]
failures = [r for r in results if r not in successes]

And it only calls the success_condition function once per element instead of twice.

Yeah, it's O(n²) when using lists, because every `r not in successes` test scans the whole list; the penalty is negligible for short lists but grows quickly as the list gets longer.

If order and duplicates don't matter and the elements are hashable, you could use sets with {} instead of []:

okays = {r for r in results if success_condition(r)}
errors = set(results) - okays

Or rely on the fact that dicts retain insertion order in Python 3.7+:

okays = {r: None for r in results if success_condition(r)}
errors = [r for r in results if r not in okays]
ctrueden
0

What about the filter function?

okays = filter(success_condition, results)
errors = filter(lambda x: not success_condition(x), results)
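
In Python 3, `filter` returns a lazy iterator rather than a list, and `itertools.filterfalse` can replace the negating lambda. The following is a sketch with made-up sample data and a made-up predicate; note that it still calls `success_condition` twice per element.

```python
from itertools import filterfalse

# Hypothetical sample data and predicate.
results = [1, 2, 3, 4, 5, 6]


def success_condition(r):
    return r % 2 == 0


# list() materializes the lazy iterators that filter/filterfalse return.
okays = list(filter(success_condition, results))
errors = list(filterfalse(success_condition, results))
```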
juan.facorro
0

Use a generator expression or list comprehension with side effects (just to make it look concise):

>>> errors, okays = [], []
>>> [okays.append(r) if success_condition(r) else errors.append(r) for r in results]

with generator expression:

>>> errors, okays = [], []
>>> list(okays.append(r) if success_condition(r) else errors.append(r) for r in results)
Ashwini Chaudhary
  • This doesn't really work. Nothing in the generator expression is executed yet. You need to use a list comprehension (or iterate through the generator expression somehow, e.g. `[None for _ in if False]`). – huon Aug 15 '12 at 18:46
  • @dbaupp I guess just `list()` will do fine. – Ashwini Chaudhary Aug 15 '12 at 18:49
  • Of course (that is equivalent to a list comprehension). But my other proposal avoids allocating a large block of memory for the list which is then immediately discarded. – huon Aug 15 '12 at 18:54
  • 1
    (Although doing some quick testing just now indicates that my trickery is slower than the plain `list(..)` call, haha!) – huon Aug 15 '12 at 18:55
  • 3
    `any()` is better: `list.append` always returns `None`, which is falsy, so `any()` never short-circuits and will run through the whole thing – John La Rooy Aug 15 '12 at 20:00
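
The `any()` suggestion from the comments could look roughly like this. It is a sketch with made-up sample data and a made-up predicate; it consumes the generator without building a throwaway list of `None` values.

```python
# Hypothetical sample data and predicate.
results = [1, -2, 3, -4]


def success_condition(r):
    return r > 0


errors, okays = [], []
# Each branch evaluates to None (the return value of list.append), so any()
# never finds a truthy value and consumes the generator to the end.
any(okays.append(r) if success_condition(r) else errors.append(r) for r in results)
```

This is still a loop run purely for its side effects, so the plain `for` loop from the question remains the more readable form.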