I need to check two columns against specific values and, at the end, get a count of all the records that match.

For example: in the 'Survived' column the value for each record must be 1, and in the 'Pclass' column the value can be 1 or 2.

I tried the following code, but Python throws a ValueError:

df['Match'] = np.where((df['Survived'] == 1) and (df['Pclass'] == 1 or 2))

With this I'm expecting to get a count of how many people survived and were in class 1 or 2.

yezzussss

2 Answers

This is a natural fit for .query():

>>> import pandas as pd
>>> df = pd.DataFrame([dict(Survived=1, Pclass=2),
...                    dict(Survived=1, Pclass=3)])
>>> df
   Survived  Pclass
0         1       2
1         1       3
>>> 
>>> df.query('Survived == 1 and Pclass in [1, 2]')
   Survived  Pclass
0         1       2
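
.query() returns the matching rows, while the question asks for how many there are. One way to get that number is to pass the result to len(). This is a minimal sketch, assuming a small made-up frame (the four rows are illustrative, not the asker's data):

```python
import pandas as pd

# Illustrative data only; not the asker's dataset.
df = pd.DataFrame({
    "Survived": [1, 1, 0, 1],
    "Pclass":   [2, 3, 1, 1],
})

# Rows where the passenger survived and travelled in class 1 or 2.
matches = df.query("Survived == 1 and Pclass in [1, 2]")

# The number of matching rows is the count being asked for.
count = len(matches)
print(count)
```

Inside the query string, `and` and `in` are parsed by pandas rather than evaluated by Python on whole Series, which is why this form does not hit the ValueError from the question.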
J_H

Hi, if you are trying to select rows by multiple conditions, you can try this.

I don't have the exact dataset to test it on, so if it works please give it an upvote.

import numpy as np

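# Flag rows where the passenger survived and was in class 1 or 2
df['Match'] = np.where(
    np.logical_and(df['Survived'] == 1, df['Pclass'].isin([1, 2])), 1, 0)
# Summing the 0/1 flags gives the number of matching rows
result = df['Match'].sum()
print(result)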
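
As for why the code in the question raises a ValueError: Python's `and` and `or` try to reduce each Series to a single True/False, which pandas refuses to do because it is ambiguous. In addition, `df['Pclass'] == 1 or 2` is parsed as `(df['Pclass'] == 1) or 2`, not as "equal to 1 or 2". A sketch of the same count using the element-wise `&` operator and no helper column, assuming a small made-up frame for illustration:

```python
import pandas as pd

# Illustrative data only; not the asker's dataset.
df = pd.DataFrame({
    "Survived": [1, 1, 0, 1],
    "Pclass":   [2, 3, 1, 1],
})

# `&` combines the two conditions row by row; each side needs its own
# parentheses because `&` binds tighter than `==`.
mask = (df["Survived"] == 1) & (df["Pclass"].isin([1, 2]))

# True counts as 1, so summing the mask gives the number of matches.
result = int(mask.sum())
print(result)
```

This skips the 'Match' column entirely; if the column is wanted as well, `df['Match'] = mask.astype(int)` should give the same 0/1 flags.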

Try it, and I hope it works for you. Leave a comment :)

buddemat
AbdoCherry