I am trying to filter a DataFrame by its values. My data has six columns: G-p1, G-p2, G-c, H-p1, H-p2, and H-c. Every value is either 'left' or 'right', indicating whether a parent or child has a left- or right-handed genotype (G) or handedness (H). I want to select the rows where the handedness of both parents and the child is 'left'. I've tried:

test1 = pd.DataFrame(data)
test1 = test1.query({
        'H-p1': 'left',
        'H-p2': 'left',
        'H-c': 'left'})
train_data = test1
predict_data = test1
model.fit(test1)
predict_data = predict_data.copy()
predict_data.drop('H-p1', axis=1, inplace=True)
predict_data.drop('H-p2', axis=1, inplace=True)
predict_data.drop('H-c', axis=1, inplace=True)
pred = model.predict_probability(predict_data)
print(pred.to_string())

But I get this error:

ValueError: expr must be a string to be evaluated, <class 'dict'> given

Any suggestions? Thank you!

Scott

1 Answer

The query method takes a string expression, similar to the boolean condition you would pass to loc, not a dict.

Try this:

test1 = test1.query("`H-p1` == 'left' and `H-p2` == 'left' and `H-c` == 'left'")
train_data = test1

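Backticks quote column names that are not valid Python identifiers, such as these names containing a hyphen. Without them, H-p1 would be parsed as H minus p1.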
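
For comparison, here is a minimal, self-contained sketch of the same filter written three ways: the query string above, a boolean mask with loc, and a mask built from a dict like the one in the question. The sample rows and the names hand_cols and conditions are made up for illustration, and backtick quoting of hyphenated names assumes a reasonably recent pandas (1.0 or later, as far as I know).

```python
import pandas as pd

# Hypothetical sample data using the column names from the question.
data = {
    "G-p1": ["left", "right", "left"],
    "G-p2": ["left", "left", "right"],
    "G-c": ["left", "right", "left"],
    "H-p1": ["left", "left", "right"],
    "H-p2": ["left", "right", "left"],
    "H-c": ["left", "left", "left"],
}
test1 = pd.DataFrame(data)

# Option 1: query string with backtick-quoted column names.
by_query = test1.query(
    "`H-p1` == 'left' and `H-p2` == 'left' and `H-c` == 'left'"
)

# Option 2: boolean mask with loc; keep rows where every
# handedness column equals 'left'.
hand_cols = ["H-p1", "H-p2", "H-c"]
by_mask = test1.loc[(test1[hand_cols] == "left").all(axis=1)]

# Option 3: start from a dict of column -> required value.
# Comparing a DataFrame with a Series matches the Series index
# against the column names.
conditions = {"H-p1": "left", "H-p2": "left", "H-c": "left"}
cols = list(conditions)
by_dict = test1.loc[(test1[cols] == pd.Series(conditions)).all(axis=1)]

print(by_query)
```

All three should select the same rows; the mask versions avoid string quoting entirely, which can be convenient when column names contain special characters.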

jcaliz