81

Given:

   patient_id test_result  has_cancer
0       79452    Negative       False
1       81667    Positive        True
2       76297    Negative       False
3       36593    Negative       False
4       53717    Negative       False
5       67134    Negative       False
6       40436    Negative       False

How do I count the False or True values in a column in Python?

I have been trying:

# number of patients with cancer

number_of_patients_with_cancer= (df["has_cancer"]==True).count()
print(number_of_patients_with_cancer)
cs95
Ney J Torres
  • Does this answer your question? [Count occurences of True/False in column of dataframe](https://stackoverflow.com/questions/53415751/count-occurences-of-true-false-in-column-of-dataframe) – Serge Stroobandt Aug 24 '21 at 13:13

7 Answers

102

So you need value_counts?

df.col_name.value_counts()
Out[345]: 
False    6
True     1
Name: has_cancer, dtype: int64
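
For completeness, here is a small self-contained sketch of this approach on the question's data. The DataFrame construction is an assumption, rebuilt from the table in the question, and `.get(label, 0)` is used so that a label absent from the column yields 0 rather than a KeyError.

```python
import pandas as pd

# Rebuilt from the table in the question (assumed to be the full data).
df = pd.DataFrame({
    "patient_id": [79452, 81667, 76297, 36593, 53717, 67134, 40436],
    "test_result": ["Negative", "Positive", "Negative", "Negative",
                    "Negative", "Negative", "Negative"],
    "has_cancer": [False, True, False, False, False, False, False],
})

counts = df["has_cancer"].value_counts()

# Series.get falls back to the default when a label is absent,
# e.g. when the column contains no True values at all.
false_count = int(counts.get(False, 0))
true_count = int(counts.get(True, 0))

print(false_count, true_count)  # 6 1
```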
Abrar Jahin
BENY
  • Thanks!! how do I print only "False"? (I'm checking https://pandas.pydata.org/pandas-docs/stable/generated/pandas.Series.value_counts.html but is not really clear) – Ney J Torres Nov 30 '18 at 03:58
  • 15
    +1@NeyJTorres For the record, you can get just the `False` number by appending `.loc[False]`, as in `df.has_cancer.value_counts().loc[False]`. However, when you only need *either* `True` or `False` (but not both), I think it's just easier to use coldspeed's approach of something like `(~df.has_cancer).sum()`. – Mike Apr 08 '19 at 15:51
64

If has_cancer has NaNs:

false_count = (~df.has_cancer).sum()

If has_cancer does not have NaNs, another option is to subtract the number of True values from the length of the dataframe, which avoids the negation. This is not necessarily better than the previous approach.

false_count = len(df) - df.has_cancer.sum()

And similarly, if you want just the count of True values, that is

true_count = df.has_cancer.sum()

If you want both, it is

fc, tc = df.has_cancer.value_counts().sort_index().tolist()
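
To illustrate how the two False counts differ when values are missing, here is a minimal sketch. It assumes the column is stored with pandas' nullable `boolean` dtype, which the question does not state; with a plain object column holding NaN, `~` may raise instead. The data is hypothetical.

```python
import pandas as pd

# Hypothetical column with one missing value, stored as nullable boolean.
has_cancer = pd.Series([True, False, False, pd.NA], dtype="boolean")

# Negate, then sum: missing values stay missing and are skipped by sum().
false_count = int((~has_cancer).sum())

# Length minus True count: the missing row is silently counted as False.
false_count_by_length = len(has_cancer) - int(has_cancer.sum())

print(false_count, false_count_by_length)  # 2 3
```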
cs95
16
0     True
1    False
2    False
3    False
4    False
5    False
6    False
7    False
8    False
9    False

If the pandas Series above is called example, then

example.sum()

outputs 1, since there is only one True value in the series. To get the count of False:

len(example) - example.sum()
David Miller
8
number_of_patients_with_cancer = df.has_cancer[df.has_cancer==True].count()
garima5aqua
1

Assuming your data frame above is named df:

True_Count = df[df.has_cancer == True]

len(True_Count)
Reza Rahemtola
Hemang Dhanani
0

Just sum the column for a count of the Trues. False is just a special case of 0 and True a special case of 1. The False count would be your row count minus that, unless you've got NAs in there.
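
As a sketch of why the attempt in the question returns the wrong number: `count()` counts every non-missing entry regardless of its value, whereas `sum()` adds up the True values as 1s. The series below is a stand-in for the `has_cancer` column.

```python
import pandas as pd

# Stand-in for df["has_cancer"] from the question.
has_cancer = pd.Series([False, True, False, False, False, False, False])

mask = has_cancer == True  # same boolean values as has_cancer itself

print(mask.count())  # 7: every non-missing entry, True or False
print(mask.sum())    # 1: only the True entries
print(len(has_cancer) - has_cancer.sum())  # 6: the False entries
```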

gilch
0

Count True:

df["has_cancer"].sum()

Count False:

(~df["has_cancer"]).sum()

See Boolean operators.

young_souvlaki