Suppose a user can specify which columns of a DataFrame to compare and the values to compare them against, for example:
column_list = ['col1', 'col2', 'col3']
value_list = [val1, val2, val3]
To select the rows where col1 >= val1 AND col2 >= val2 AND col3 >= val3, we would write:
selection = (df['col1'] >= val1) & (df['col2'] >= val2) & (df['col3'] >= val3)
or, to get the matching rows directly rather than a boolean mask:
selection = df.loc[(df['col1'] >= val1) & (df['col2'] >= val2) & (df['col3'] >= val3)]
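To make the difference between the two forms concrete, here is a small sketch. The sample data and the values of val1, val2 and val3 are made up for illustration; only the column names come from the lists above.

```python
import pandas as pd

# Hypothetical sample data; the real df and values would come from the user.
df = pd.DataFrame({"col1": [1, 5, 9], "col2": [2, 6, 1], "col3": [3, 7, 8]})
val1, val2, val3 = 4, 1, 5

# First form: a boolean mask, i.e. a Series with one True/False per row.
mask = (df["col1"] >= val1) & (df["col2"] >= val2) & (df["col3"] >= val3)

# Second form: indexing with the mask returns only the matching rows.
selection = df.loc[mask]
```

With this data the mask is False for the first row and True for the other two, so selection keeps the rows with index 1 and 2.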
The number of columns is not known in advance, so there can be n of them. A naive approach would be:
if n == 1:
    selection = (df['col1'] >= val1)
elif n == 2:
    selection = (df['col1'] >= val1) & (df['col2'] >= val2)
elif n == 3:
    selection = (df['col1'] >= val1) & (df['col2'] >= val2) & (df['col3'] >= val3)
But this is neither scalable nor efficient. I tried generating the strings "(df['col<>'] >= val<>)"
in a for loop over the input lists, but that didn't work because pandas does not accept a str as a row condition.
What would be the most Pythonic approach for this, one that avoids writing out every case with if/elif statements?
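For illustration, one shape the n-column case might take is to build one condition per (column, value) pair and combine them with &. This is only a sketch under assumptions: select_rows is a hypothetical helper name, the sample data is made up, and the comparison is fixed to >= as in the examples above.

```python
from functools import reduce
import operator

import pandas as pd


def select_rows(df, column_list, value_list):
    """Return the rows where every listed column is >= its paired value.

    Hypothetical helper, not part of pandas.
    """
    if len(column_list) != len(value_list):
        raise ValueError("column_list and value_list must have the same length")

    # One boolean Series per (column, value) pair.
    conditions = [df[col] >= val for col, val in zip(column_list, value_list)]

    # Start from an all-True mask so an empty column_list selects every row,
    # then AND the conditions together.
    all_true = pd.Series(True, index=df.index)
    mask = reduce(operator.and_, conditions, all_true)
    return df.loc[mask]


# Made-up sample data for illustration.
df = pd.DataFrame({"col1": [1, 5, 9], "col2": [2, 6, 1], "col3": [3, 7, 8]})
selection = select_rows(df, ["col1", "col2", "col3"], [4, 1, 5])
```

Because the conditions are real boolean Series rather than strings, the same call works for any n without branching on the number of columns.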