
I have a DataFrame like this:

df,
A   B   C   D   Final
a   b   c   d   Valid
a       c       Valid
a       c   d   Valid
a               Valid

I want to count how many rows have a value in each of the columns A to D, and what percentage of all rows that is.

My desired output is:

output = a=4,b=1,c=3,d=2

Please help

Pyd

1 Answer


If the empty cells are missing values (NaN), drop the Final column and use count, which counts the non-NaN values in each column:

print (df)
   A    B    C    D  Final
0  a    b    c    d  Valid
1  a  NaN    c  NaN  Valid
2  a  NaN    c    d  Valid
3  a  NaN  NaN  NaN  Valid

counts = df.drop('Final', axis=1).count()
print (counts)
A    4
B    1
C    3
D    2
dtype: int64
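
As a self-contained check of the count approach, here is a minimal sketch that rebuilds the sample frame. It assumes the blank cells are NaN; the name counts is only illustrative.

```python
import numpy as np
import pandas as pd

# Sample data from the question, with blank cells assumed to be NaN.
df = pd.DataFrame({
    "A": ["a", "a", "a", "a"],
    "B": ["b", np.nan, np.nan, np.nan],
    "C": ["c", "c", "c", np.nan],
    "D": ["d", np.nan, "d", np.nan],
    "Final": ["Valid"] * 4,
})

# count() ignores NaN, so this gives the number of filled cells per column.
counts = df.drop(columns="Final").count()
print(counts.to_dict())
```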

If the empty cells are empty strings, first compare with ne('') and then sum the True values:

print (df)
   A  B  C  D  Final
0  a  b  c  d  Valid
1  a     c     Valid
2  a     c  d  Valid
3  a           Valid

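counts = df.drop('Final', axis=1).ne('').sum()
print (counts)
A    4
B    1
C    3
D    2
dtype: int64

print (counts.to_dict())
{'A': 4, 'B': 1, 'C': 3, 'D': 2}

For percentages, divide the counts by the number of rows in the original DataFrame (keep the counts in a separate variable so that df is still the DataFrame here):

d = counts.div(len(df.index)).mul(100).to_dict()
print (d)
{'A': 100.0, 'B': 25.0, 'C': 75.0, 'D': 50.0}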
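
If it is not known whether the blanks are NaN or empty strings, both cases can be handled in one pass. The following is only a sketch under that assumption, not part of the original answer; the sample data mixes both kinds of blank on purpose, and the names values, present and percent are illustrative.

```python
import numpy as np
import pandas as pd

# Sample data with a mix of NaN and empty strings as "blank" cells (assumed).
df = pd.DataFrame({
    "A": ["a", "a", "a", "a"],
    "B": ["b", "", np.nan, ""],
    "C": ["c", "c", "c", np.nan],
    "D": ["d", np.nan, "d", ""],
    "Final": ["Valid"] * 4,
})

values = df.drop(columns="Final")

# A cell counts as present only if it is neither NaN nor an empty string.
present = values.notna() & values.ne("")

# The mean of a boolean column is the fraction of True values.
percent = present.mean().mul(100)
print(percent.to_dict())
```

Taking the mean of the boolean mask avoids dividing by the row count by hand, so the result stays correct when the number of rows differs from the number of columns.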
jezrael