dictionary-comprehension with conditional calling nested list-comprehension

Question

I have a dictionary of keys and lists. I'd like to iterate through the dictionary, get each list, iterate through each list and apply a condition, then append that filtered list to a new dictionary.

The function already works imperatively. Can I do the same functionally with list and dict comprehensions? The main blocker is that the wrapping dict-comp has a conditional which needs the length of the list-comp.

Here it is working imperatively:

filtered_prediction_dict = {}
for prediction, confidence_intervals in prediction_dict.items():
    filtered_confidence_intervals = []
    for i in confidence_intervals:
        if i > threshold:
            filtered_confidence_intervals.append(i)
    if len(filtered_confidence_intervals) >= 1:
        filtered_prediction_dict[prediction] = filtered_confidence_intervals

I was wondering if I could do the same thing functionally with comprehensions, something like this:

filtered_prediction_dict = {prediction: [i for i in confidence_intervals if i > threshold] for prediction, confidence_intervals in prediction_dict.items() if len(filtered_confidence_intervals) >= 1}

Of course, the linter points out that filtered_confidence_intervals hasn't been defined yet when len(filtered_confidence_intervals) is evaluated in the conditional.

Any way around this?

`"I was wondering if I could do the same thing functionally with comprehensions"` please don't, not if you want to understand your code 1 week from now — DeepSpace, Sep 05 '19 at 08:19
Just make `filtered_confidence_intervals` a list comprehension, but leave the rest as is. — jonrsharpe, Sep 05 '19 at 08:20
Yes, I was hoping there was a meta way to avoid computing the list comprehension twice, but it seems unavoidable. The any() function is useful. — Hung-Ray Ho, Sep 09 '19 at 06:08

j-i-l · Accepted Answer · 2019-09-06T11:14:47.947

You can combine the two conditions you apply to the confidence intervals into a single expression. I would also recommend doing the filtering with a list comprehension in any case.

The two conditions:

the confidence interval is bigger than the threshold (the if i > threshold)
at least one confidence interval is bigger than the threshold (the len(filtered_confidence_intervals) >= 1)

Expressed in a single statement:

any(ci > threshold for ci in confidence_intervals)

The resulting dict-comprehension version (split up for readability):

{
    p: [ci for ci in cis if ci > threshold]  # only keep ci > threshold
    for p, cis in prediction_dict.items()  # iterate through the items
    if any(ci > threshold for ci in cis)  # only consider items with at least one ci > threshold
}

IMHO this is not less readable than for-loops, but I guess this is a matter of taste and use.
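To see what the comprehension produces, here is a small run on made-up data. The keys, values and threshold below are invented for illustration and are not from the question.

```python
# Hypothetical sample data: keys and numbers are invented for illustration.
prediction_dict = {
    "cat": [0.2, 0.7, 0.9],
    "dog": [0.1, 0.3],
    "bird": [0.5],
    "fish": [],
}
threshold = 0.5

filtered_prediction_dict = {
    p: [ci for ci in cis if ci > threshold]  # only keep ci > threshold
    for p, cis in prediction_dict.items()  # iterate through the items
    if any(ci > threshold for ci in cis)  # skip items with nothing above the threshold
}

print(filtered_prediction_dict)  # {'cat': [0.7, 0.9]}
```

Note that "bird" is dropped because 0.5 is not strictly greater than the threshold, and "fish" is dropped because any() of an empty iterable is False.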

If you want to keep for-looping:

filtered_prediction_dict = {}
for prediction, confidence_intervals in prediction_dict.items():
    if any(ci > threshold for ci in confidence_intervals):
        filtered_prediction_dict[prediction] = [ci for ci in confidence_intervals if ci > threshold]

A note on your remark that the linter reports filtered_confidence_intervals as not yet defined:

Linters are usually accurate, and this case is no exception. In the imperative version, filtered_confidence_intervals is built separately for each item in prediction_dict. Inside the comprehension nothing ever binds that name, so the condition has no filtered_confidence_intervals whose length it could test.

You would need to replace the statement:

len(filtered_confidence_intervals) >= 1

in the dict comprehension with

len([ci for ci in confidence_intervals if ci > threshold]) >= 1
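Both forms above run the filter twice per item. On Python 3.8 or newer, an assignment expression (the := operator from PEP 572) is one way to build the filtered list once and reuse it. This is a different technique from the one in this answer, shown only as a sketch on made-up data. It relies on the comprehension's condition being evaluated before its key and value expressions, and on an empty list being falsy.

```python
# Requires Python 3.8+ for the := operator.
# Hypothetical sample data, invented for illustration.
prediction_dict = {"cat": [0.2, 0.7, 0.9], "dog": [0.1, 0.3]}
threshold = 0.5

filtered_prediction_dict = {
    prediction: kept  # reuse the list bound in the condition below
    for prediction, confidence_intervals in prediction_dict.items()
    # Bind the filtered list to `kept`; an empty list is falsy, so the item is skipped.
    if (kept := [ci for ci in confidence_intervals if ci > threshold])
}

print(filtered_prediction_dict)  # {'cat': [0.7, 0.9]}
```

The trade-off is that `kept` leaks into the enclosing scope, and some readers find := inside a comprehension harder to follow than the plain loop.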

kederrac · Answer 2 · 2019-09-05T08:32:11.983

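You can use:

filtered_prediction_dict = {prediction: [i for i in confidence_intervals if i > threshold] for prediction, confidence_intervals in prediction_dict.items() if any(e > threshold for e in confidence_intervals)}

This way, filtered_prediction_dict never contains an empty list. The condition uses the same strict > as the filter; with >=, a list whose largest value equals the threshold would pass the condition and still be filtered down to an empty list.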

Or you can use:

filtered_prediction_dict = {prediction: [i for i in confidence_intervals if i > threshold] for prediction, confidence_intervals in prediction_dict.items() if max(confidence_intervals) > threshold}

Note that max() raises ValueError if one of the lists is empty.

The second version iterates over each list twice (once for max(), once for the filter). The first repeats only part of the iteration, because any() stops at the first match. Even so, both solutions may be faster than using for statements.
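Whether the comprehensions really beat the for statements depends on the data and the Python version, so it is worth measuring. A rough sketch with timeit follows; the function names, data sizes and repeat count are assumptions made up for this example, and both conditions use > so that they match the filter. No timings are quoted here because they vary by machine.

```python
import random
import timeit

random.seed(0)
threshold = 0.5
# Hypothetical data: 1,000 keys with 20 random floats each.
prediction_dict = {
    f"p{n}": [random.random() for _ in range(20)] for n in range(1000)
}


def with_for_loops():
    """Imperative version, as in the question."""
    result = {}
    for prediction, confidence_intervals in prediction_dict.items():
        kept = []
        for i in confidence_intervals:
            if i > threshold:
                kept.append(i)
        if kept:
            result[prediction] = kept
    return result


def with_any():
    """Dict comprehension guarded by any()."""
    return {
        prediction: [i for i in confidence_intervals if i > threshold]
        for prediction, confidence_intervals in prediction_dict.items()
        if any(e > threshold for e in confidence_intervals)
    }


def with_max():
    """Dict comprehension guarded by max(); every list here is non-empty."""
    return {
        prediction: [i for i in confidence_intervals if i > threshold]
        for prediction, confidence_intervals in prediction_dict.items()
        if max(confidence_intervals) > threshold
    }


# The three versions must agree before the timings mean anything.
assert with_for_loops() == with_any() == with_max()

for fn in (with_for_loops, with_any, with_max):
    seconds = timeit.timeit(fn, number=20)
    print(f"{fn.__name__}: {seconds:.4f} s for 20 runs")
```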
