Python: building a CCDF out of a list

Question

I have the following list of tuples, where the first element of each tuple is a value and the second is the number of occurrences of that value:

mylist=[(2, 45), (3, 21), (4, 12), (5, 7), 
(6, 2), (7, 2), (8, 3), (9, 2), 
(10, 1), (11, 1), (15, 1), (17, 2), (18, 1)]

I want to compute the CCDF (complementary cumulative distribution function) of those values, using the counts stored as the second element of each tuple.

My code:

ccdf=[(i,sum(k>=i for i in mylist)) for i,k in mylist]

But this does not work, as every count in the result is zero:

ccdf=[(2, 0), (3, 0), (4, 0), (5, 0), 
(6, 0), (7, 0), (8, 0), (9, 0), 
(10, 0), (11, 0), (15, 0), (17, 0), (18, 0)]

The counts in the second position of each tuple sum to 100. So, I would like to know how many times I have a value >= 2 (all 100), how many times I have a value >= 3 (100-45=55), how many times I have a value >= 4 (100-45-21=34), and so forth. The result would thus be:

ccdf=[(2, 100), (3, 55), (4, 34), (5, 22), 
(6, 15), (7, 13), (8, 11), (9, 8), 
(10, 6), (11, 5), (15, 4), (17, 3), (18, 1)]

What is wrong in my list comprehension?

Why the downvote? I thought we were supposed to justify downvoting. Weren't we? — FaCoffee, Jul 01 '16 at 08:52

score 1 · Accepted Answer · answered Jun 30 '16 at 15:34

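Your inner comprehension (the generator expression passed to sum) is off. There are two issues:

1. The correct syntax for a conditional (list) comprehension is: [x for x in someiterable if predicate(x)]
2. You are using the same variable name, i, in both iterations. The inner loop rebinds i to each whole tuple of mylist, so k>=i compares a count with a tuple instead of with the outer value. That is confusing and error-prone.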
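To see the second issue in isolation, here is a minimal sketch on a shortened list. The helper name count_with_shadowing is made up for illustration. It assumes the question was run under Python 2, where comparing an int with a tuple quietly evaluates to False and every count therefore comes out as zero. Python 3 raises a TypeError for the same comparison, which the sketch catches and reports.

```python
mylist = [(2, 45), (3, 21), (4, 12)]


def count_with_shadowing(pairs):
    """Hypothetical helper that reproduces the expression from the question."""
    try:
        # The inner `i` shadows the outer one: it is bound to each whole
        # tuple in `pairs`, so `k >= i` compares an int with a tuple.
        return [(i, sum(k >= i for i in pairs)) for i, k in pairs]
    except TypeError as exc:
        # Python 3 refuses to order an int against a tuple.
        return "TypeError: {}".format(exc)


result = count_with_shadowing(mylist)
print(result)
```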

Try this instead:

ccdf=[(i,sum(k2 for i2,k2 in mylist if i2 >= i)) for i,k in mylist]
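As a quick check, the sketch below runs that comprehension on the data from the question. Because every entry satisfies >= 2, the smallest value gets the full total of 100. The nested version rescans the list for each entry, so a single-pass variant is included as well. That variant is not part of the answer above: it assumes Python 3 (for itertools.accumulate) and that mylist is sorted by value, and the names values, suffix_sums and ccdf_fast are illustrative.

```python
from itertools import accumulate

mylist = [
    (2, 45), (3, 21), (4, 12), (5, 7), (6, 2), (7, 2), (8, 3),
    (9, 2), (10, 1), (11, 1), (15, 1), (17, 2), (18, 1),
]

# Corrected comprehension: for each value i, add up the counts k2 of
# every entry whose value i2 is >= i.
ccdf = [(i, sum(k2 for i2, k2 in mylist if i2 >= i)) for i, k in mylist]

# Single-pass variant (assumes mylist is sorted by value): accumulate the
# counts from the largest value down, then flip back to ascending order.
values = [value for value, _ in mylist]
suffix_sums = list(accumulate(count for _, count in reversed(mylist)))[::-1]
ccdf_fast = list(zip(values, suffix_sums))

print(ccdf)
# [(2, 100), (3, 55), (4, 34), (5, 22), (6, 15), (7, 13), (8, 11),
#  (9, 8), (10, 6), (11, 5), (15, 4), (17, 3), (18, 1)]
print(ccdf == ccdf_fast)  # True
```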

score 1 · Answer 2 · answered Jun 30 '16 at 15:50

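from functools import reduce  # a builtin in Python 2; must be imported in Python 3

mylist = [
    (2, 45), (3, 21), (4, 12), (5, 7), (6, 2),
    (7, 2), (8, 3), (9, 2), (10, 1), (11, 1),
    (15, 1), (17, 2), (18, 1)
]


def get_sum_of_values(_list):
    # Add up the counts (second element) of every (value, count) tuple.
    return reduce(lambda a, b: a + b[1], _list, 0)


def calculate_ccdf(mylist):
    sum_of_values = get_sum_of_values(mylist)
    # mylist is sorted by value, so the total minus the counts up to and
    # including the current entry is the number of occurrences strictly
    # greater than _tuple[0]. Slice with mylist[0:index] instead to count
    # occurrences >= _tuple[0].
    return [(_tuple[0], sum_of_values - get_sum_of_values(mylist[0:index+1]))
            for index, _tuple in enumerate(mylist)]


print(calculate_ccdf(mylist))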
