0

I have a dataframe that describes whether work sites have tools. It has a simple true/false for describing if the tool is present. Unfortunately, the tool names are not standardized but I do have a list of them.

The dataframe looks like this:

+-------+---------+-------+-------+-------+-------+
| Index |  County | State | JDKFK | YSAFQ | MDFKN |
+-------+----_----+-------+-------+-------+-------+
|     0 | '001'   | '05'  | TRUE  | TRUE  | TRUE  |
|     1 | '030'   | '05'  | FALSE | TRUE  | FALSE |
|     2 | '005'   | '05'  | TRUE  | TRUE  | FALSE |
+-------+---------+-------+-------+-------+-------+

Using my list, how can I produce a new column that has the sum of TRUE in the specified columns?

cols = ['JDKFK ', 'YSAFQ ', 'MDFKN ']

+-------+--------+-------+-------+-------+-------+----------+
| Index | County | State | JDKFK | YSAFQ | MDFKN | Tool_Sum |
+-------+--------+-------+-------+-------+-------+----------+
|     0 | '01'   | '05'  | TRUE  | TRUE  | TRUE  |        3 |
|     1 | '01'   | '05'  | FALSE | TRUE  | FALSE |        1 |
|     2 | '01'   | '05'  | TRUE  | TRUE  | FALSE |        2 |
+-------+--------+-------+-------+-------+-------+----------+

Count occurrences of False or True in a column in pandas does not answer my question because that post defines how to count True/False in a single column (vertical) whereas my question is how to count True/False across multiple columns (horizontal).

adin
  • 783
  • 3
  • 13
  • 27
  • Can you clarify what exactly the issue is? Please see [ask], [help/on-topic]. – AMC Apr 01 '20 at 21:54
  • Does this answer your question? [Count occurrences of False or True in a column in pandas](https://stackoverflow.com/questions/53550988/count-occurrences-of-false-or-true-in-a-column-in-pandas) – AMC Apr 01 '20 at 21:56

1 Answers1

3

You could use sum:

df['tool_sum'] = df[['JDKFK', 'YSAFQ', 'MDFKN']].sum(1)
print(df)

Output

   Index County State  JDKFK  YSAFQ  MDFKN  tool_sum
0      0   '01'  '05'   True   True   True         3
1      1   '01'  '05'  False   True  False         1
2      2   '01'  '05'   True   True  False         2

The sum(1) means that is going to sum across the second axis (i.e 1 because it counts from 0). Another way to see it sum across columns.

Dani Mesejo
  • 61,499
  • 6
  • 49
  • 76