
I would like to apply custom conditions/masks to a pandas DataFrame from a string, and I am wondering whether this is possible the way I have in mind. Please see the example code below:

# df is just some pandas DataFrame loaded from a CSV

mask = "df['Col1'] == 1 & df['Col2'] == 'Complete'"

print(df[mask])

How do I do that in a way that works? How do I turn the string into the expression it contains? Is there any other method? I reckon this could be useful for many applications, not only pandas.

NOTE: I am aware that I can pass multiple arguments using a dictionary but this is not the same case.

Edgar M

2 Answers


Is this what you need? DataFrame.query accepts the condition as a string:

import pandas as pd
d = {'col1': [1, 2], 'col2': [3, 4]}
df = pd.DataFrame(data=d)

mask = "col1==2 & col2==4"
df.query(mask)


Out[1]: 
   col1  col2
1     2     4
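
As a rough sketch of how this could map onto the question's own condition, the snippet below uses made-up data with the column names Col1 and Col2 (the values are assumptions, not the asker's CSV). It also shows DataFrame.eval, which returns the boolean mask itself rather than the filtered rows, in case the mask is what you want to keep around.

```python
import pandas as pd

# Hypothetical data using the column names from the question.
df = pd.DataFrame({
    "Col1": [1, 1, 2],
    "Col2": ["Complete", "Pending", "Complete"],
})

# The condition arrives as a plain string; columns are referenced by bare name.
condition = "Col1 == 1 & Col2 == 'Complete'"

# query() returns the matching rows directly.
filtered = df.query(condition)

# eval() returns the boolean Series, which can be used as an ordinary mask.
mask = df.eval(condition)
same_rows = df[mask]

print(filtered)
```

Inside query and eval strings, & takes the precedence of and, which is why the condition works here without extra parentheses.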
  • I feel stupid now. It does work like that after all. Incorrect example then but it solves the issue for pandas. I will test out some possibilities before I ask another question on the subject. Cheers – Edgar M Jan 22 '20 at 13:29

I don't think you need the double quotation marks. Furthermore, I'd use parentheses to separate the conditions. Here is a working example:

import pandas as pd

data = {'col1':['x','x','x','y','g'],'col2':['a','a','b','b','p'],'col3':['abc','def','efg','cfg','def']}
df = pd.DataFrame(data)
# Each comparison is parenthesized so that & combines two boolean Series
mask = (df['col1'] == 'x') & (df['col2'] == 'a')
print(df.loc[mask])

Output:

  col1 col2 col3
0    x    a  abc
1    x    a  def

You can also drop the loc in this case; df[mask] gives the same output. For ease of comparison, this is the original dataframe:

  col1 col2 col3
0    x    a  abc
1    x    a  def
2    x    b  efg
3    y    b  cfg
4    g    p  def
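
To illustrate the two points above (the parentheses and dropping loc), here is a small sketch. It reuses col1 and col2 from the example; the use of the standard ast module to inspect how Python parses the unparenthesized form is my own addition, purely for illustration.

```python
import ast

import pandas as pd

df = pd.DataFrame({
    "col1": ["x", "x", "x", "y", "g"],
    "col2": ["a", "a", "b", "b", "p"],
})

mask = (df["col1"] == "x") & (df["col2"] == "a")

# Plain indexing and .loc select the same rows for a boolean mask.
same = df[mask].equals(df.loc[mask])
print(same)

# In plain Python, & binds tighter than ==, so without parentheses the
# expression is parsed as a chained comparison: a == (1 & b) == 2.
tree = ast.parse("a == 1 & b == 2", mode="eval")
middle_operand = tree.body.comparators[0]
print(type(tree.body).__name__, type(middle_operand).__name__)
```

The parse shows a Compare node whose middle operand is a BinOp, meaning 1 & b is evaluated before either comparison, which is why each condition needs its own parentheses when building a mask in regular Python code.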
Celius Stingher