I have a dataframe:

df = C1  C2  E  
     1    2  3
     4    9  1
     3    1  1 
     8    2  8
     8    1  2

I want to add another column that holds, for each row, the number of times that row's value in column 'E' appears in column E across the whole dataframe. So here the output will be:

df = C1  C2  E  cou
     1    2  3  1
     4    9  1  2
     3    1  1  2
     8    2  8  1
     8    1  2  1  # 2 appears only once in column E

How can this be done efficiently?

Cranjis

1 Answer

Here's one way. Find the matches and add them up.

import pandas as pd

data = [
    [1,2,3],[4,9,1],[3,1,1],[8,2,8]
]

df = pd.DataFrame( data, columns=['C1','C2','E'])
print(df)

def count(val):
    return (df['C1']==val).sum() + (df['C2']==val).sum()

df['cou'] = df.E.apply(count)
print(df)

Output:

   C1  C2  E
0   1   2  3
1   4   9  1
2   3   1  1
3   8   2  8
   C1  C2  E  cou
0   1   2  3    1
1   4   9  1    2
2   3   1  1    2
3   8   2  8    1
Tim Roberts
  • Please note that the count is within column E. I changed the example to make it clearer – Cranjis Nov 30 '22 at 18:51
  • No, it's not. The value you're SEARCHING for is in column E. My result matches your desired result exactly. – Tim Roberts Nov 30 '22 at 18:52
  • please see the last edit – Cranjis Nov 30 '22 at 18:52
  • So, are you saying that columns C1 and C2 are totally irrelevant to the problem? Then why did you include them, especially when their contents match your desired data? I hope you can see how to change my code to sum up `df['E']` instead of `df['C1']` and `df['C2']`. – Tim Roberts Nov 30 '22 at 18:55
  • Yes, I just wonder whether there is a more efficient way to do it – Cranjis Nov 30 '22 at 18:56
  • You need to search the column once for every value. You could iterate over the column and keep a running count yourself, I suppose, but you'd still have to apply the count as a new column at the end. – Tim Roberts Nov 30 '22 at 18:57
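
For the case the comments settle on, where only column E matters, one possible alternative to searching the column once per row is to tally every distinct value once and then look each row up in that tally. The sketch below is not part of the answer above; it assumes the five-row frame from the question and uses the standard pandas calls `Series.value_counts`, `Series.map`, and `GroupBy.transform`. The column name `cou_alt` is made up here purely for illustration.

```python
import pandas as pd

# The five-row example from the question.
df = pd.DataFrame(
    [[1, 2, 3], [4, 9, 1], [3, 1, 1], [8, 2, 8], [8, 1, 2]],
    columns=["C1", "C2", "E"],
)

# value_counts() builds a value -> count table for column E in one pass;
# map() then looks up each row's E value in that table.
df["cou"] = df["E"].map(df["E"].value_counts())

# An equivalent form: count the rows in each E group and broadcast
# the group size back to every row ("cou_alt" is a hypothetical name).
df["cou_alt"] = df.groupby("E")["E"].transform("count")

print(df)
```

Both forms avoid the per-row comparison against the whole column that `apply(count)` performs, so they should scale better on large frames, though the actual gain depends on the data.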