Getting submission error:

ERROR: The value '7.63E+15' in the key column 'ID' has already been defined (Line 23029, Column 1).

Link to the challenge: https://www.kaggle.com/c/santander-value-prediction-challenge

Head of the submission file:

          ID         target
0      000137c73  5.944923e+06
1      00021489f  5.944923e+06
2      0004d7953  5.944923e+06
3      00056a333  5.944923e+06
4      00056d8eb  5.944923e+06
Faraz Gerrard Jamal

2 Answers

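I guess you opened the file in Excel or LibreOffice Calc. Opening the file in a spreadsheet to view the output and then saving it can break the formatting: an ID that looks like a number (digits, then 'e', then more digits) is likely to be rewritten in scientific notation such as 7.63E+15, so distinct IDs can collapse into the same value. Generally the best thing to do is to avoid Excel entirely. Are you using Python? The easiest thing to do is to load the sample submission, replace the target column, and save: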

import pandas as pd

# preds: your model's predictions, in the same row order as the sample submission
ss = pd.read_csv('sample_submission.csv')
ss['target'] = preds
ss.to_csv('sub.csv', index=False)
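
To confirm whether a spreadsheet has already rewritten some IDs, something like the sketch below might help. It assumes the IDs are plain hexadecimal strings like those in the head of the file above; the function name `find_mangled_ids` and the sample values are made up for illustration.

```python
import re

# Matches values such as "7.63E+15": digits, an optional decimal part,
# then an exponent with an explicit sign. A genuine hex ID contains
# neither "." nor "+", so it is not matched.
SCI_NOTATION = re.compile(r"^\d+(\.\d+)?[eE][+-]\d+$")


def find_mangled_ids(ids):
    """Return the IDs that look like scientific notation, not hex strings."""
    return [i for i in ids if SCI_NOTATION.match(i)]


# Made-up sample: two intact IDs and one value taken from the error message.
ids = ["000137c73", "00021489f", "7.63E+15"]
mangled = find_mangled_ids(ids)
print(mangled)

# Why this happens: a numeric parser reads a hypothetical ID such as
# "000763e13" as 763 x 10^13, which displays as 7.63E+15.
as_number = float("000763e13")
```

With pandas, the IDs could be read as strings via `pd.read_csv('sub.csv', dtype=str)['ID']` and passed to the function. If it returns anything, rebuilding the file from the sample submission is probably safer than editing it by hand.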
Prashant Gupta

This error means the file contains a duplicate value in the 'ID' column. Check the expected shape of the submission file in the competition's submission info, then verify whether your file has the same number of rows (it probably has one extra row, i.e. row 23029, where the 'ID' column holds the duplicate value). Try deleting the duplicate rows. That worked for me.
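
A quick way to locate the duplicates before deleting anything is sketched below. It works on a plain list of ID strings; the helper name `duplicate_ids` and the sample values are hypothetical.

```python
from collections import Counter


def duplicate_ids(ids):
    """Return each ID that occurs more than once, mapped to its count."""
    return {i: n for i, n in Counter(ids).items() if n > 1}


# Made-up sample: one value appears twice.
submission_ids = ["000137c73", "7.63E+15", "00021489f", "7.63E+15"]
dupes = duplicate_ids(submission_ids)
print(dupes)  # {'7.63E+15': 2}
```

With pandas, the list could come from `pd.read_csv('sub.csv', dtype=str)['ID']`. If the duplicated values look like `7.63E+15` rather than the original IDs, deleting rows may make the error go away but leaves the wrong IDs in place, so comparing the row count against the sample submission is still worthwhile.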