2

I have a table which shows product ID's and how many times they have been given, 1 star, 2 stars, 3 stars, 4 stars and 5 stars when reviewed by customers along with the average rating for that product. There are some duplicate rows appearing in this table. How can I remove the rows which are complete duplicates e.g. ProductId 1196585. An example of the table is below.

ProductId  | rate1 | rate2 | rate3 | rate4 | rate5 | aveRate

1294518    |   4   |   1   |   0   |   0   |   0   |   1

9226582    |   0   |   0   |   3   |   0   |   0   |   3

3946583    |   0   |   0   |   0   |   1   |   0   |   4

7392588    |   1   |   0   |   0   |   0   |   6   |   5

1196585    |   0   |   0   |   2   |   3   |   9   |   5

1196585    |   0   |   0   |   2   |   3   |   9   |   5

For clarification I want to modify the table and I will make a copy of it first.

Janak
  • 4,986
  • 4
  • 27
  • 45
Ben Paton
  • 1,432
  • 9
  • 35
  • 59
  • Do you have a unique identifier for each row? – Kickstart May 24 '13 at 08:28
  • No the table is exactly as shown, would an identifyer help with not getting duplicates in the first place? – Ben Paton May 24 '13 at 08:29
  • If there's only supposed to be one row per productId, you should set the column as `UNIQUE`. – Phylogenesis May 24 '13 at 08:29
  • Without a unique id for the row I can't see a way in SQL to delete a particular row (if there were it would be possible). As such you probably need to do in in a couple of stages as suggested by Er. Nikhil Agrawal suggests below, or to do it by adding a unique index as Devart suggests. – Kickstart May 24 '13 at 08:34

3 Answers3

3

The simpliest way is to add unique key using ALTER TABLE statement with IGNORE option -

ALTER IGNORE TABLE table_name
  ADD UNIQUE INDEX (ProductId, rate1, rate2, rate3, rate4, rate5, aveRate);

From the documentation - IGNORE is a MySQL extension to standard SQL. It controls how ALTER TABLE works if there are duplicates on unique keys in the new table or if warnings occur when strict mode is enabled. If IGNORE is not specified, the copy is aborted and rolled back if duplicate-key errors occur. If IGNORE is specified, only the first row is used of rows with duplicates on a unique key, The other conflicting rows are deleted. Incorrect values are truncated to the closest matching acceptable value.

Devart
  • 119,203
  • 23
  • 166
  • 186
1

Follow these step to accomplish this task.

  1. create a temp table

    create table temp_table_name as select distinct * from table_name;

  2. Drop your table

    drop table table_name;

  3. Rename the temproary table.

    rename temp_table_name to table_name;

Nikhil Agrawal
  • 26,128
  • 21
  • 90
  • 126
1

Further to my earlier comment about a unique ID field, I had a play trying to come up with a normal SQL way of doing this in one statement and without the need to temporarily alter the table if you had such an id.

DELETE ProductRate 
FROM ProductRate
INNER JOIN (
SELECT id, ProductId, rate1, rate2, rate3, rate4, rate5, aveRate, 
@Counter := (
            CASE
                WHEN ProductId != @ProductId THEN 0
                WHEN rate1 != @rate1 THEN 0
                WHEN rate2 != @rate2 THEN 0
                WHEN rate3 != @rate3 THEN 0
                WHEN rate4 != @rate4 THEN 0
                WHEN rate5 != @rate5 THEN 0
                WHEN aveRate != @aveRate THEN 0
                ELSE @Counter + 1
            END
            ) AS RecCounter,
@ProductId:=ProductId, 
@rate1:=rate1, 
@rate2:=rate2, 
@rate3:=rate3, 
@rate4:=rate4, 
@rate5:=rate5, 
@aveRate:=aveRate
FROM ProductRate,
(SELECT @Counter:=0, @ProductId:=0, @rate1:=0, @rate2:=0, @rate3:=0, @rate4:=0, @rate5:=0, @aveRate:=0) Deriv1
ORDER BY ProductId, rate1, rate2, rate3, rate4, rate5, aveRate) Deriv2
ON ProductRate.id = Deriv2.id
WHERE Deriv2.RecCounter >0

Note that this assumes you do have a unique id on the table. You can easily add one by:-

ALTER TABLE `ProductRate` ADD `id` INT NOT NULL AUTO_INCREMENT PRIMARY KEY FIRST

I have only done this out of interest and for a one off job I would be happy using the suggestions from either Er. Nikhil Agrawal or Devart. However if this is a regular issue which can occur but only occasionally needs cleaning up, adding the unique meaningless id and using this SQL might be worthwhile.

Kickstart
  • 21,403
  • 2
  • 21
  • 33