
I am doing binary classification, where the predicted values are 0 and 1. Is there any way to get the feature values that correspond to each predicted value?

For example, I have two features, 'Age' and 'Salary', and the target value is 'Purchased':

Age  Salary  Purchased
19   19000   0
35   20000   0
27   30000   1
41   29000   1
65   40000   1

So, for each test case outcome (0 or 1), I want to know what the feature values (Age and Salary) were.

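import pandas as pd
from sklearn.linear_model import LogisticRegression
from sklearn.model_selection import train_test_split

# Columns 0 and 1 are the features (Age, Salary); column 2 is the target (Purchased)
df = pd.read_csv('data.csv')
x = df.iloc[:, [0, 1]]
y = df.iloc[:, 2]

# Hold out 25% of the rows for testing
x_train, x_test, y_train, y_test = train_test_split(x, y, test_size=0.25, random_state=0)

regressor = LogisticRegression()
regressor.fit(x_train, y_train)

# Predicted class (0 or 1) for each row of x_test, in the same order
y_pred = regressor.predict(x_test)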
Roman Soviak
H.Banik
  • When you say you want to know, what do you mean by this? `x_test` has all of your features at the same index as their predictions in `y_pred`. Are you looking for a way to combine them into a single data structure? Create a visualization of their relationship? – morsecodist Feb 18 '18 at 19:41
  • You can take a look here: http://scikit-learn.org/stable/modules/generated/sklearn.linear_model.LogisticRegression.html, maybe you want something to do with `regressor.coef_` or `regressor.intercept_` – joaoavf Feb 19 '18 at 04:26
  • Yes, I want to know how to combine `x_test` and `y_pred` so that the end user can view them clearly. @morsecodist – H.Banik Feb 19 '18 at 06:12

1 Answer


Based on your clarification, you just want to put them in the same data structure. You can concatenate two DataFrames with pandas, but you first need to put the predictions in a DataFrame that has the appropriate index, because `y_pred` is a plain NumPy array and `pd.concat` aligns rows by index label. Here is the code:

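# Wrap the predictions in a DataFrame that reuses x_test's index so the rows line up
y_pred_df = pd.DataFrame(y_pred, index=x_test.index, columns=['Predicted'])
# Join features and predictions side by side
result = pd.concat([x_test, y_pred_df], axis=1)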
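
To see why the index matters, here is a small self-contained sketch. It is only an illustration under some assumptions: `y_pred` is a hard-coded stand-in for `regressor.predict(x_test)`, the two test rows are taken from the example table in the question, and the column name `Predicted` is my own choice.

```python
import numpy as np
import pandas as pd

# Two held-out rows. The non-contiguous index (1, 3) mimics what
# train_test_split leaves behind after shuffling the original rows.
x_test = pd.DataFrame({"Age": [35, 41], "Salary": [20000, 29000]}, index=[1, 3])

# Stand-in for regressor.predict(x_test): a NumPy array, which carries no index.
y_pred = np.array([0, 1])

# Option 1: concat, reusing x_test's index so the rows line up by label.
y_pred_df = pd.DataFrame(y_pred, index=x_test.index, columns=["Predicted"])
results_concat = pd.concat([x_test, y_pred_df], axis=1)

# Option 2: assign() adds the array as a new column on a copy of x_test.
# The array is placed positionally, so no index handling is needed.
results = x_test.assign(Predicted=y_pred)

# Pitfall: without index=x_test.index the new frame is indexed 0, 1.
# concat aligns on labels, so the result gains a row and contains NaN.
misaligned = pd.concat(
    [x_test, pd.DataFrame(y_pred, columns=["Predicted"])], axis=1
)

print(results)
print(misaligned)
```

Both options should give the same table of Age, Salary and Predicted for the end user. `assign` is a bit shorter if you do not need the predictions as a separate DataFrame.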
morsecodist