I'm new to both Kaggle and Python and can't figure out how to convert a column of float predictions into integers (0 or 1). For anyone familiar, I'm trying to reproduce the gender-based solution for the Titanic tutorial.
I have:
submission = pd.DataFrame({'PassengerId' : test_data.PassengerId, 'Survived' : final_prediction})
print(submission.head())
Which gives me:
   PassengerId  Survived
0          892  0.184130
1          893  0.761143
2          894  0.184130
3          895  0.184130
4          896  0.761143
Which I need to convert to:
   PassengerId  Survived
0          892         0
1          893         1
2          894         0
3          895         0
4          896         1
Since I don't really know Python, I've tried a few approaches, such as:
for x in np.nditer(final_prediction, op_flags=['readwrite']):
    x[...] = (1 if x[...] >= 0.50 else 0)
This gives me floating-point values (which still show up in the CSV file as 0.0 and 1.0):
   PassengerId  Survived
0          892        0.
1          893        1.
And:
rounded_prediction = np.rint(final_prediction)
This gives me the same floating-point result (i.e. 0., 1.).
I also tried:
int_prediction = final_prediction.astype(int)
This gives me all 0's.
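For reference, here is a minimal sketch of what the attempts above appear to be doing, plus one possible way to get integer output. It assumes final_prediction is a NumPy float array; the sample values are copied from the printed output above, and passenger_ids is a hypothetical stand-in for test_data.PassengerId. Writing 0 or 1 into a float array in place leaves the dtype as float, np.rint rounds but also returns floats, and astype(int) truncates toward zero, which would explain the all-zero result for probabilities below 1.

```python
import numpy as np
import pandas as pd

# Hypothetical stand-in for final_prediction, using the values printed above.
final_prediction = np.array([0.184130, 0.761143, 0.184130, 0.184130, 0.761143])
# Hypothetical stand-in for test_data.PassengerId.
passenger_ids = np.arange(892, 897)

# astype(int) truncates toward zero, so every value below 1.0 becomes 0.
truncated = final_prediction.astype(int)

# np.rint rounds to the nearest integer but keeps the float dtype.
rounded = np.rint(final_prediction)

# Apply the 0.5 threshold first, then cast the boolean result to int.
int_prediction = (final_prediction >= 0.5).astype(int)

submission = pd.DataFrame({'PassengerId': passenger_ids,
                           'Survived': int_prediction})
print(submission.head())
```

Chaining the calls as np.rint(final_prediction).astype(int) should give the same result for these values. The two approaches can differ at exactly 0.5, since np.rint rounds halves to the nearest even integer while the threshold above maps 0.5 to 1.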
Any ideas? Thanks!