0

I created two MySQL tables. One to import a CSV in, called staging_part_type and one where the actual part_type records are supposed to end up called part_type, minus any records that have a product_code already present. Both tables are identical, consisting of the columnsid, in_use, default_name, product_code.

Is there a specific name for doing something like this? I haven't found any complete tutorials on this, only a bunch of separate ones and I'm failing to put them together in a working query as I lack experience in this subject and after a week of reading, I still barely understand what I'm doing.

Something like this?

INSERT IGNORE INTO part_type (in_use, default_name, product_code)
        SELECT (in_use, default_name, product_code)
            FROM staging_part_type      
            WHERE part_type.product_code IS NOT staging_part_type.product_code;

Code to create the table, both tables are identical except for the name.

CREATE TABLE IF NOT EXISTS `part_type` (
  `id` int(11) NOT NULL AUTO_INCREMENT,
  `in_use` tinyint(1) NOT NULL,
  `default_name` varchar(255) COLLATE utf8_unicode_ci NOT NULL,
  `product_code` varchar(255) COLLATE utf8_unicode_ci NOT NULL,
  PRIMARY KEY (`id`)
) ENGINE=InnoDB  DEFAULT CHARSET=utf8 COLLATE=utf8_unicode_ci AUTO_INCREMENT=580 ;

Can someone please explain to me how to do this or point me to a resource on this subject?

Isaac Bennetch
  • 11,830
  • 2
  • 32
  • 43
G_V
  • 2,396
  • 29
  • 44

3 Answers3

3

As is often the case in SQL, there are two ways to do this, with a Subquery and with a join.

The subquery method is easier to understand and write, but the join method is more efficient and what you'd do if this was a mission critical query.

Since you're just importing a simple CSV, we will use a "correlated subquery" with "exists", which works like this:

INSERT INTO part_type (in_use, default_name, product_code)
    SELECT in_use, default_name, product_code
        FROM staging_part_type      
        WHERE NOT EXISTS 
          (SELECT product_code FROM part_type WHERE product_code = staging_part_type.product_code)

You were thinking along the right lines, but you need to set up two queries and link them together. The query in parentheses runs and the only results that come back are ones that don't already exist in the row. Try running the second line onwards by itself first, to see what you would get back, and then add in the first line to make the insert happen.

Professor Falken
  • 1,017
  • 11
  • 13
  • Very elegant and exactly what I needed for this situation as the file is indeed very small. I am currently trying to work my way through Fundamentals of Database Management Systems by Gillenson. It's a lot to take in but I'll definitely look into how Join differs from Subquery. – G_V Sep 06 '16 at 21:14
1

A typical method is to start with a unique index/constraint on part_type and then use on duplicate key update:

CREATE UNIQUE INDEX unq_part_type_product_code ON part_type(product_code);

INSERT INTO part_type (in_use, default_name, product_code)
     SELECT in_use, default_name, product_code
     FROM staging_part_type  
     ON DUPLICATE KEY UPDATE product_code = VALUES(product_code);

The advantages of this method. First, it works, looking up the product code using the unique index. Doing an additional lookup is not necessary (but you can do it).

Second, the SET is a non-operation, so it does not result in any increase in the number of records being modified.

Third, this only ignores errors caused by a duplication of keys (probably only product_code). Other errors are not ignored.

Fourth, this allows you to pass in other values from the staging table, if you like.

Fifth, this has no race conditions. The database ensures the uniqueness of product_code.

Gordon Linoff
  • 1,242,037
  • 58
  • 646
  • 786
  • I like the concept of this and I have made `product_code` unique. I am however getting a syntax error on the SET part of the statement. on http://dev.mysql.com/doc/refman/5.7/en/insert-on-duplicate.html there seems to be a slightly different order, though it also seems to replace the original. – G_V Sep 06 '16 at 21:16
  • @G_V . . . That is because `SET` is not part of the syntax. It just looks right to me after an `UPDATE` so I sometimes add it in. – Gordon Linoff Sep 07 '16 at 01:09
1

You're looking for INSERT INTO SELECT. It's detailed some in this question and in the docs here.

Community
  • 1
  • 1
willy
  • 1,462
  • 11
  • 12