How to tell Pandas/Scikit-Learn how one field impacts predictive model

Question

I am trying to create and validate a predictive model on a fictitious dataset, using Python with sklearn, following this tutorial.

The dataset contains information about baseball pitcher throws, and these are the most important fields:

Result (whether the player was successful/unsuccessful in throwing a strike)
Direction (whether it was a High, Medium, or Low throw)
Other fields like speed of ball, player stats, etc.

Based on the different fields, the model will attempt to predict what direction (the Direction field) a pitcher should throw in order to get a strike.

In the tutorial I am following (linked above), this is an example of a call to the function that generates the model, in this case for logistic regression (but we could use any of the other classification techniques listed):

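from sklearn.linear_model import LogisticRegression

# classification_model is the function from the tutorial that generates the model
outcome_var = 'Direction'
model = LogisticRegression()
predictor_var = ['Result', <insert other fields here>]
classification_model(model, df, predictor_var, outcome_var)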

How can I tell the model about the negative impact (and importance) of the Result field?

Basically, if Result is "Successful", it should train the model to choose the same Direction (High/Medium/Low) when faced with the same scenario. However, if Result is "Unsuccessful", it should train the model to choose a different Direction from the one in the sample, because that was not a good choice (regardless of the other fields).

How can I tell the model how to use the Result field in order to make decisions? I can include more details (or code) if needed. Thanks!

score 0 · Accepted Answer · answered Jan 28 '18 at 18:56

You don't.

The whole point of doing Machine Learning is to have the machine automatically learn relationships and rules from data.

So, the way to help the model find such relationships is to provide it with as much (correct) data as possible. With enough data, a decent model should be able to generalise and find out, in your case, whether the 'Result' field is useful or not for predicting the 'Direction' outcome.
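One way to see this in practice is to compare cross-validated accuracy with and without 'Result' among the predictors. The following is only a minimal sketch on made-up data, not the tutorial's code: the SpeedBin column, the rule that a throw is a strike when the direction matches the speed bin, and the choice of a decision tree instead of logistic regression are all assumptions for illustration.

```python
import numpy as np
import pandas as pd
from sklearn.model_selection import cross_val_score
from sklearn.tree import DecisionTreeClassifier

rng = np.random.default_rng(0)
n = 3000

# Synthetic data (assumption): every column and the strike rule are invented.
speed_bin = rng.integers(0, 3, size=n)   # 0 = slow, 1 = medium, 2 = fast
direction = rng.integers(0, 3, size=n)   # 0 = Low, 1 = Medium, 2 = High
# Hypothetical rule: the throw is a strike when direction matches the speed bin.
result = (direction == speed_bin).astype(int)

df = pd.DataFrame(
    {"SpeedBin": speed_bin, "Result": result, "Direction": direction}
)


def mean_cv_accuracy(predictors):
    """Mean 5-fold cross-validated accuracy for predicting Direction."""
    model = DecisionTreeClassifier(random_state=0)
    scores = cross_val_score(
        model, df[predictors], df["Direction"], cv=5, scoring="accuracy"
    )
    return scores.mean()


acc_without = mean_cv_accuracy(["SpeedBin"])
acc_with = mean_cv_accuracy(["SpeedBin", "Result"])
print(f"without Result: {acc_without:.2f}, with Result: {acc_with:.2f}")
```

On this toy data the accuracy with 'Result' should come out noticeably higher, and nothing in the code tells the model how 'Result' relates to 'Direction'. The same comparison on your real data would show whether the field helps there.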

carrdelling

Thank you. I guess I was trying to take too much control of what I was doing. I appreciate it! :) – Irina Jan 29 '18 at 04:14
