
I have a dataframe with several columns, and I want to filter it on conditions over those columns in various combinations. I want to keep all rows where any one set of conditions is met.

For instance, if the four conditions are:

  1. city = "NYC" and weather = "Rainy"
  2. city = "Philly" and weather = "Sunny" and time = "Day"
  3. city = "Philly" and weather = "Rainy" and time = "Night"
  4. city = "Albany" and time = "Night"

I want to keep all rows where any of those four conditions is met.

Writing that out as comparisons on data["city"] and the other columns, chained with a bunch of ands and ors, sounds messy, and there is room for error as my conditions grow.

What do you think is the best way to handle this?

For clarification, the dataframe below is the input, before running the procedure:

City Weather Time
NYC Sunny Day
NYC Rainy Night
Philly Sunny Day
Philly Rainy Day
Philly Rainy Night
Seattle Windy Day
Albany Rainy Night
Albany Sunny Day

The following is the resulting dataframe:

City Weather Time
NYC Rainy Night
Philly Sunny Day
Philly Rainy Night
Albany Rainy Night
N27

1 Answer


You could combine the conditions with '&' (and) and '|' (or), wrapping each comparison in parentheses, like this:

df[
((df['City']=='NYC') & (df['Weather']=='Rainy'))
| ((df['City']=='Philly') & (df['Weather']=='Sunny') & (df['Time']=='Day'))
| ((df['City']=='Philly') & (df['Weather']=='Rainy') & (df['Time']=='Night'))
| ((df['City']=='Albany') & (df['Time']=='Night'))
  ]
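If the list of conditions keeps growing, one possible way to avoid hand-writing the '&' / '|' chain is to keep the conditions as data and build the mask in a loop. The following is only a minimal sketch under assumptions: the names `conditions` and `rule_mask` are made up for illustration, every condition is a plain equality test, and the sample frame is copied from the question.

```python
from functools import reduce
import operator

import pandas as pd

# Sample data copied from the question.
df = pd.DataFrame({
    "City": ["NYC", "NYC", "Philly", "Philly", "Philly", "Seattle", "Albany", "Albany"],
    "Weather": ["Sunny", "Rainy", "Sunny", "Rainy", "Rainy", "Windy", "Rainy", "Sunny"],
    "Time": ["Day", "Night", "Day", "Day", "Night", "Day", "Night", "Day"],
})

# Each dict is one rule: all of its column == value pairs must hold (AND).
# A column that is left out of a rule may take any value.
conditions = [
    {"City": "NYC", "Weather": "Rainy"},
    {"City": "Philly", "Weather": "Sunny", "Time": "Day"},
    {"City": "Philly", "Weather": "Rainy", "Time": "Night"},
    {"City": "Albany", "Time": "Night"},
]


def rule_mask(frame, rule):
    """Boolean Series that is True where every column in `rule` matches (hypothetical helper)."""
    return reduce(operator.and_, (frame[col] == value for col, value in rule.items()))


# A row is kept if it satisfies at least one rule (OR across rules).
mask = reduce(operator.or_, (rule_mask(df, rule) for rule in conditions))
result = df[mask]
print(result)
```

Adding a condition then means adding one dict to the list, and the filtering code itself does not change.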
Ezer K
  • In practice I have about 7 more conditions than this, and I did not want to do this. Is there a better way? – N27 May 05 '22 at 19:08
  • hmm, not sure, I think it depends on your specific logic, if it could be expressed differently – Ezer K May 05 '22 at 19:14
  • yeah I want to avoid the and / or (&/|) – N27 May 05 '22 at 19:15
  • maybe try this approach: https://stackoverflow.com/questions/48569166/multiple-if-else-conditions-in-pandas-dataframe-and-derive-multiple-columns – Ezer K May 05 '22 at 19:19
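
Following up on the wish to avoid typing the and / or chain by hand: another option that might fit is to generate a filter expression from a list of rules and pass it to `DataFrame.query`. This is a hedged sketch, not a tested solution for the real data: `rules` and `expr` are illustrative names, it assumes the column names are valid Python identifiers, and it assumes every condition is a simple equality on a string value.

```python
import pandas as pd

# Sample data copied from the question.
df = pd.DataFrame({
    "City": ["NYC", "NYC", "Philly", "Philly", "Philly", "Seattle", "Albany", "Albany"],
    "Weather": ["Sunny", "Rainy", "Sunny", "Rainy", "Rainy", "Windy", "Rainy", "Sunny"],
    "Time": ["Day", "Night", "Day", "Day", "Night", "Day", "Night", "Day"],
})

# One dict per rule; columns that are left out are unconstrained.
rules = [
    {"City": "NYC", "Weather": "Rainy"},
    {"City": "Philly", "Weather": "Sunny", "Time": "Day"},
    {"City": "Philly", "Weather": "Rainy", "Time": "Night"},
    {"City": "Albany", "Time": "Night"},
]

# Build "(col == 'val' and ...) or (...)" once, instead of writing it by hand.
expr = " or ".join(
    "(" + " and ".join(f"{col} == {value!r}" for col, value in rule.items()) + ")"
    for rule in rules
)

result = df.query(expr)
print(expr)
print(result)
```

The generated string is still an and / or expression underneath, but it is derived from the rule list, so the rules can live in a config file or a separate table and be reviewed on their own.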