Pandas - Counting rows in a df to discover the survival rate each day

Question

.

Hello, guys!

I have a dfA (Table A) containing the amount of days that some products have been available (days_survived). I need to count the number of products that were available each day in total (Table B). I mean, I need counting rows in dfA to discover the survival rate each day for the first 5 days (df2).

Table A:

+-------+--------------+
| id    | days_survived|
+-------+--------------+
| 1     |  1           |
| 2     |  3           |
| 3     |  10          | 
| 4     |  40          |
| 5     |  4           |
| 6     |  9           |
+-------+--------------+

Table B (Expected results analyzing the first 5 days):

+-------+----------------+
| day   | #count_survived|
+-------+----------------+
| 1     |  6             |
| 2     |  5             |
| 3     |  5             | 
| 4     |  4             |
| 5     |  3             |
+-------+----------------+

This result means that on the first day 6 products in total were available, then only 5 on the second and third day, then only 4 on the fourth day and finally only 3 on the fifth day.

code:

# create df
import pandas as pd
d = {'id': [1,2,3,4,5,6], 'days_survived': [1,3,10,40,4,9]}
dfA = pd.DataFrame(data=d)

Could anyone help me, please? :)

jezrael · Accepted Answer · 2019-01-31T06:30:02.877

Use list comprehension with flattening and filtering and then count:

comp = [y for x in dfA['days_survived'] for y in range(1, x + 1) if y < 6]
s = pd.Series(comp).value_counts().rename_axis('day').reset_index(name='#count_survived')
print (s)
   day  #count_survived
0    1                6
1    3                5
2    2                5
3    4                4
4    5                3

Another solution with Counter:

from collections import Counter

comp = [y for x in dfA['days_survived'] for y in range(1, x + 1) if y < 6]
d = Counter(comp)
df = pd.DataFrame({'day':list(d.keys()), '#count_survived':list(d.values())})

You rock! Thanks for your help again! : ) – Thaise Jan 31 '19 at 06:29 — Thaise, Jan 31 '19 at 06:29

Jithin P James · Answer 2 · 2019-01-31T08:57:47.783

This is using Collections, creating a list of all occurrences of days for which an item was present and then counting the number of occurrences of each day from the list

import pandas as pd
import numpy as np
from collections import Counter

df = pd.DataFrame(data={'id': [1,2,3,4,5,6], 'days_survived': [1,3,10,40,4,9]})
# We will create a new column having values as a list of all the days for which item was present
df['Days'] = df.apply(lambda a :  list(np.arange(1,a.days_survived+1)),axis=1)
# Applyin Counter to the flattened list of all elements in 'Days' column
cnt= Counter([item for items in list(df['Days']) for item in items])
#Converting cnt Counter object to Dataframe
df_new = pd.DataFrame(data= {'Days':list(cnt.keys()),'count':list(cnt.values())})

Hope this helps.

Pandas - Counting rows in a df to discover the survival rate each day

2 Answers2