
I'm able to get a log loss score as low as 0.24 in training and 0.38 in validation, but once I submit my predictions to Kaggle to score on the test set, the loss is way off (sometimes as high as 4, and rarely below 0.69). Any ideas what could be going wrong?

FYI, here is my code for writing the predictions to file (using Keras & Tensorflow backend):

import pandas as pd

# predict() returns an array of shape (n_samples, 1); take the first column.
# Note: the original used map(lambda x: x[0], predictions), which returns a
# lazy iterator on Python 3 rather than a list.
predictions = model.predict(test, verbose=0)
preds = pd.DataFrame({
    "label": predictions[:, 0]
})
preds.index += 1  # Kaggle ids start at 1
preds.to_csv('submissions/submission.csv', index_label="id")

Thank you!

Anas
  • There might be nothing wrong, just that your model is overfitting and performs poorly on test data that is different from the training set. – Dr. Snoopy Jan 11 '17 at 23:52
  • @MatiasValdenegro Right, I'm sure that could be the case. But to be this far off? I've used data augmentation to randomly modify my training set and generate more samples, but no luck. – Anas Jan 12 '17 at 15:09

0 Answers