
I'm able to get a log loss score as low as 0.24 in training and 0.38 in validation, but once I submit my predictions to Kaggle to score on the test set, the loss is way off (sometimes as high as 4, and rarely below 0.69). Any ideas what could be going wrong?

FYI, here is my code for writing the predictions to file (using Keras & Tensorflow backend):

import pandas as pd

# predict() returns an array of shape (n_samples, 1); take the first column.
# Note: the original used map(lambda x: x[0], predictions), which returns a
# lazy iterator on Python 3 rather than a list.
predictions = model.predict(test, verbose=0)
preds = pd.DataFrame({
    "label": predictions[:, 0]
})
preds.index += 1  # Kaggle ids start at 1
preds.to_csv('submissions/submission.csv', index_label="id")

Thank you!

Anas
  • There might be nothing wrong, just that your model is overfitting and performs poorly on test data that is different from the training set. – Dr. Snoopy Jan 11 '17 at 23:52
  • @MatiasValdenegro Right, I'm sure that could be the case. But to be this far off? I've used data augmentation to randomly modify my training set and generate more samples, but no luck. – Anas Jan 12 '17 at 15:09

0 Answers