
I've searched all over and can't seem to figure this out, so here is my first Stack Exchange question.

I'm using a Java program to run the bulk-load process, but I've also tried it straight from my SQL client, MySQL Workbench, and I get the same error:

LOAD DATA INFILE '/path/to/file/infile.csv' 
INTO TABLE t1
FIELDS TERMINATED BY ',' 
LINES TERMINATED BY '\n'
(category, item, date_time, v1, v2, v3);

Error:

Error Code: 1062. Duplicate entry ''Book'-'Fiction'-2014-04-16 09:33:00' for key 'PRIMARY'

Using my SQL client I've confirmed that there is no such record currently in the table; in fact, I don't have any records for the same category-type pair in that month. I have many (~16,000) CSV files to load into my MySQL database each month; each file corresponds to a separate category-type pair with different values over the course of the month. I have been successful with this method so far, having loaded over 50 million records, but now I can't load any more without getting this same error.
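One sanity check worth showing here: whether the CSV file itself repeats a (category, item, date_time) key, since LOAD DATA inserts every row of the file and would trip the PRIMARY key even against an empty table. A minimal sketch on hypothetical stand-in data (`sample.csv` stands in for the real infile path, and the two rows are invented to illustrate a repeated key):

```shell
# Build a tiny stand-in CSV with a deliberately repeated key in fields 1-3.
printf 'Book,Fiction,2014-04-16 09:33:00,1,2,3\nBook,Fiction,2014-04-16 09:33:00,4,5,6\n' > sample.csv

# Extract the three key fields, sort, and print only repeated keys.
# Any output here means the file would collide with itself on load.
cut -d',' -f1-3 sample.csv | sort | uniq -d
```

An empty result means the file's keys are unique and the collision must come from elsewhere (existing rows, a trigger, etc.).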

My table uses 3 fields for the PRIMARY key: two varchar() columns and a datetime:

CREATE TABLE `t1` (
  `category` varchar(10) NOT NULL,
  `item` varchar(15) NOT NULL DEFAULT '',
  `date_time` datetime NOT NULL,
  `v1` double DEFAULT NULL,
  `v2` double DEFAULT NULL,
  `v3` double DEFAULT NULL,
  PRIMARY KEY (`category`,`type`,`date_time`)
) ENGINE=InnoDB DEFAULT CHARSET=latin1

  • I have worked with databases in the past, but nowhere near this many records; I don't know if that is the problem.

  • I could switch to an auto-incremented id for my primary key, but it might take up more room given the large number of records, and it would allow duplicate (category, item, date_time) tuples to slip in, which would be problematic.

  • I know that MySQL permits a “relaxed” format for values specified as strings, and I may need to do some additional formatting to figure this out.

  • I deleted the first line of my csv file with the value ''Book'-'Fiction'-2014-04-16 09:33:00', but then I get the same 1062 error for the next date time value ''Book'-'Fiction'-2014-04-16 09:35:00'

  • I thought it might be the way I am formatting my datetime string, but I am using the "YYYY-MM-DD HH:MM:SS" format, which has worked on thousands of other LOAD DATA INFILE statements. Just to be safe I tried the STR_TO_DATE() function; see below:

    LOAD DATA INFILE '/path/to/file/infile.csv' 
    INTO TABLE t1
    FIELDS TERMINATED BY ',' 
    LINES TERMINATED BY '\n'
    (category, item, @date_var, v1, v2, v3)
    SET date_time = STR_TO_DATE(@date_var, '%Y-%m-%d %H:%i:%s');

Any help would be appreciated.

GothamNite
    The error seems pretty clear: you have more than one row in the file with the same `category, type, date_time`. No amount of reformatting will fix this, you need to do something about the duplicates, or use the `IGNORE` option to skip the duplicates. – Barmar May 01 '14 at 03:34
  • Your input file is full of duplicates. – Namphibian May 01 '14 at 04:33
  • Thanks @Barmar, however when I search for the duplicates I can't find them in the table. The IGNORE option will help when I do have duplicates in the future – GothamNite May 01 '14 at 13:24
  • 1
    How do you have a column in your primary key that isn't in the table? There's no `type` column in the table, should that be `item`? – Barmar May 01 '14 at 14:44
  • There's not enough information here to figure out what could be causing this error if you don't see the duplicates. There are lots of related questions in the sidebar, I suggest you go through them and see if any of them are helpful (one of the suggests looking for triggers that create additional rows). – Barmar May 01 '14 at 14:49
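Following Barmar's suggestion above: if the file does turn out to contain repeated keys, one option besides LOAD DATA's IGNORE keyword is to drop the repeats before loading. A sketch with awk, again on hypothetical stand-in data (`sample.csv` replaces the real infile path; the three rows are invented):

```shell
# Stand-in CSV: rows 1 and 2 share a key, row 3 is unique.
printf 'Book,Fiction,2014-04-16 09:33:00,1,2,3\nBook,Fiction,2014-04-16 09:33:00,4,5,6\nBook,Fiction,2014-04-16 09:35:00,7,8,9\n' > sample.csv

# Keep only the first row seen for each (field1, field2, field3) key.
awk -F',' '!seen[$1 FS $2 FS $3]++' sample.csv > deduped.csv

wc -l < deduped.csv   # two unique keys remain
```

This mirrors what `LOAD DATA ... IGNORE INTO TABLE t1 ...` would do on the server side (first row wins, later duplicates are skipped), but lets you inspect what was dropped before anything touches the database.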
