1

I want to delete the duplicate rows in a table using jdbc.

I just tried to below code which is deleting all the records in a table

   String query="Delete from tests where product_id=20 and product_name='KINDLE001'";

Is there any way to delete the duplicate records and keep the last duplicate value in table.

Like there is a column in table called S.NO 1,2,3 and 1,2 rows are duplicate then I just want to delete record 1. And keep 2, 3 unique records as It's.

Any leads?

ArrchanaMohan
  • 2,314
  • 4
  • 36
  • 84

4 Answers4

1

The general way how to attack the dups in a table is as follows.

1) define the unique key column(s) - I use in my example KEY1, KEY2

2) define the column identifying the order - the highest value is preserved, all other values are considered as dups. I use ORDER1

Example

create table tab as
select 1 key1, 1 key2, 1 order1 from dual union all  -- dup
select 1 key1, 1 key2, 2 order1 from dual union all  -- dup
select 1 key1, 1 key2, 3 order1 from dual union all
select 1 key1, 2 key2, 1 order1 from dual union all
select 2 key1, 1 key2, 1 order1 from dual union all  -- dup
select 2 key1, 1 key2, 2 order1 from dual union all
select 2 key1, 2 key2, 1 order1 from dual;

This query identifies the duplicated rows

select KEY1, KEY2, ORDER1 from 
 (select tab.*,
   row_number()  over (partition by key1, key2 order by order1 desc) as rn
  from tab)
where rn > 1

      KEY1       KEY2     ORDER1
---------- ---------- ----------
         1          1          2 
         1          1          1 
         2          1          1

and this query deletes the dulicates

delete from tab where (KEY1, KEY2, ORDER1) in
(select KEY1, KEY2, ORDER1 from 
 (select tab.*,
   row_number()  over (partition by key1, key2 order by order1 desc) as rn
  from tab)
where rn > 1)

Substitute your table and columns names for TAB, KEY1, KEY2 and ORDER1.

Marmite Bomber
  • 19,886
  • 4
  • 26
  • 53
1

Depends on how big is your database. If you want to have a very fast result, it is better to use batch commands. For example you have a simple table with the fields: id, name, surname and age. Again suppose you want to delete all duplicated name fields and keep only the oldest person (I just wanted to give an example so it has all the criteria). First you create a model class of your table (I always prefer to define a model class). Then, create getters , setters and finally constructors that you need.

  public class modelRecord {
        int id;
        int age;
        String name;
        String surname;
      public modelRecord() {
            super();
        }
      public modelRecord(int age, String name) {
        super();
        this.age= age;
        this.name= name;
    }
    public int getAge() {
        return age;
    }

    public void setAge(int age) {
        this.age= age;
    }
// the same for other fields
    }

Crate another class as your DAO codes. Next, you can search your table to create a list of your record duplication. suppose that you called your data model modelRecord.class then it means that you have to define a function which returns a List of this class type:

public List<modelRecord>  ListDuplicatedReviewId() {
   
   Connection con=null;
    ResultSet rs=null;
    List<modelRecord> list=new ArrayList<modelRecord>();
    PreparedStatement ps=null;


 // your code for accessing to db   DeleteQuery="SELECT max(age), name, COUNT(*) countNum FROM table_name GROUP BY `name` HAVING countNum > 1"; ps=con.prepareStatement(DeleteQuery); rs = ps.executeQuery();
                while (rs.next()) {
                    int maxAge=rs.getInt(1);
                    String name=rs.getString("name");
                    modelRecordrec=new modelRecord(age, name);
                    list.add(rec);
                } // close all your db connection resources
return list;
    }

Then, you can use this list to apply a batch delete in mysql:

public void DeleteBatch(List<modelRecord> list) throws SQLException {
    final String deleteQueryBatch="Delete from table_name where name=? AND age <?";

    try(Connection con=// get your connection to db with batch command support)){
        try(PreparedStatement ps=con.prepareStatement(deleteQueryBatch)){
        
            con.setAutoCommit(false);
        
            for (modelRecord rec : list) {
                ps.setString(1, rec.getName() );
                
                ps.setInt(2, rec.getAge());
                
                ps.addBatch();
            
            }
            
            
            int[] result =ps.executeBatch();
            int sum1 = IntStream.of(result).sum();
            System.out.println("PreparedStatement Batch executed, DELETE done= " + sum1);
            con.commit();
        
        } catch (SQLException e) {
            con.rollback();
            System.out.println(e);
        }
    }
    
}

Finally you need a class to call these two methods. I Hope I explained Clearly.

Now Second method in case your table does not have a lot of records. do a normal inner join delete and you do not need batch command. in order to do that you just need a copy of your table. I write here just the query:

DeleteQuery=" delete table1 from table1 inner join table2 on table1.id=table2.id where table1.name=table2.name and table1.age<table2.age"
Shila Mosammami
  • 999
  • 6
  • 20
0

You can use the below method as mentioned in here

Select all unique rows
Copy them to a new temp table
Truncate original table
Copy temp table data to original table
Yug Singh
  • 3,112
  • 5
  • 27
  • 52
  • can you give me some example. I don't have privilege to create any new table. just have to delete duplicates and keep one single record – ArrchanaMohan Sep 20 '18 at 18:25
0
private void delete_duplicatesActionPerformed(java.awt.event.ActionEvent evt) {                                                  

    Connection connection = getConnection();
    try {
            String querydi = "DELETE t1 FROM winner_loser AS t1 ,winner_loser AS t2 where t1.date='"+getdate1.getText()+"' AND t2.date='"+getdate1.getText()+"'  AND t1.k=t2.k AND t1.sn>t2.sn  ";
            pstmt = (PreparedStatement) connection.prepareStatement(querydi);

            pstmt.executeUpdate();

            JOptionPane.showMessageDialog(null, "Delete Duplicates Successfully ! Reload again pls !");

            pstmt.close();
            connection.close();
        } catch (SQLException ex) {

        }     
}    

The above code run well to delete duplicates values.

Dmitriy Fialkovskiy
  • 3,065
  • 8
  • 32
  • 47