120

How could I go about detecting (returning true/false) whether an ArrayList contains more than one of the same element in Java?

Many thanks, Terry

Edit: Forgot to mention that I am not looking to compare "Blocks" with each other but their integer values. Each "block" has an int and this is what makes them different. I find the int of a particular Block by calling a method named "getNum" (e.g. table1[0][2].getNum()).

17 Answers

229

Simplest: dump the whole collection into a Set (using the HashSet(Collection) constructor or addAll), then see if the Set has the same size as the ArrayList.

List<Integer> list = ...;
Set<Integer> set = new HashSet<Integer>(list);

if(set.size() < list.size()){
    /* There are duplicates */
}

Update: If I'm understanding your question correctly, you have a 2d array of Block, as in

Block table[][];

and you want to detect if any row of them has duplicates?

In that case, I could do the following, assuming that Block implements "equals" and "hashCode" correctly:

for (Block[] row : table) {
   Set<Block> set = new HashSet<Block>(); 
   for (Block cell : row) {
      set.add(cell);
   }
   if (set.size() < 6) { //has duplicate
   }
}

I'm not 100% sure of that for syntax, so it might be safer to write it as

for (int i = 0; i < 6; i++) {
   Set<Block> set = new HashSet<Block>(); 
   for (int j = 0; j < 6; j++) {
      set.add(table[i][j]);
   }
 ...

Set.add returns false if the item being added is already in the set, so you could even short-circuit and bail out on the first add that returns false, if all you want to know is whether there are any duplicates.
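For the table-of-Blocks case, that early exit might look like the following sketch. Since the Block class itself isn't shown in the question, it uses plain int cell values standing in for each Block's getNum():

```java
import java.util.HashSet;
import java.util.Set;

public class RowDuplicateCheck {
    // Returns true as soon as any row contains a repeated value,
    // relying on Set.add returning false for an element already present.
    public static boolean anyRowHasDuplicate(int[][] table) {
        for (int[] row : table) {
            Set<Integer> seen = new HashSet<>();
            for (int cell : row) {
                if (!seen.add(cell)) {
                    return true; // bail out early: duplicate found in this row
                }
            }
        }
        return false;
    }
}
```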

Paul Tomblin
    Make sure to implement hashCode/equals as well. – jon077 Feb 18 '09 at 21:27
    Or even a bit easier: wrap it when creating the set, e.g. new HashSet(list), instead of using addAll. – Fabian Steeg Feb 18 '09 at 21:28
    @jon077: That depends on your definition of "duplicate". – Michael Myers Feb 18 '09 at 21:29
  • Would the process of detecting the elements in a 2D array be the same? For example, checking from array[0][0] to array[0][6] (a 'row')..? Many thanks, Terry –  Feb 18 '09 at 21:29
  • Each object in the array holds an integer value. By "duplicate", the object would have the same integer value. –  Feb 18 '09 at 21:30
  • Terry, there are probably more efficient ways, but yes, you could add the array to a Set and compare Set.size to array.length. – Paul Tomblin Feb 18 '09 at 21:33
  • "Duplicate" has multiple meanings for objects (as opposed to primitives). If you're just using ints, or the Integer class which implements equals(), that shouldn't be an issue. – Nikhil Feb 18 '09 at 21:35
  • I'm trying to implement the solution above but it seems to be triggering an error message which says: "myFile.java uses unchecked or unsafe operations" Not quite sure what it means or what to do! –  Feb 18 '09 at 22:22
  • I think we'd have to see the code you actually wrote. The snippet in the answer isn't intended to compile as written, but what was written, should compile fine, provided you filled in the bits well. – Paul Brinkley Feb 18 '09 at 23:51
  • You can't turn a table into a list like that. Look at the Arrays class. – Paul Tomblin Feb 19 '09 at 00:20
  • If you're trying to turn the entire 36 Block objects into a single list, you need to loop. – Paul Tomblin Feb 19 '09 at 00:22
  • Comments aren't a good place to solve a secondary problem like this. Edit your question and I'll edit the answer, or ask a new question. – Paul Tomblin Feb 19 '09 at 00:23
  • Ah, I see that a "table" is a different thing entirely. Assume that "table1" is the name given to 2D array of "Block" objects –  Feb 19 '09 at 00:24
  • Sorry about the clutter in the comments, I've now edited my question. Thank you for the help so far! –  Feb 19 '09 at 00:34
  • Ok, so from what I can tell, the loops add each 'row' to a Set. So am i correct in thinking that now all that remains to do is compare each Set to the size of table1 (and if Set's size is less than the size of table1's size then there are duplicates)? –  Feb 19 '09 at 00:43
  • You said you wanted to find out if there were any duplicates *within* a row, not over the whole table. That's why I compare the set.size() to 6. – Paul Tomblin Feb 19 '09 at 00:52
  • Update: Solution implemented and working! Thank you all for your input on this problem and special thanks to Paul for going out of your way to help! Warm Regards, Terry –  Feb 19 '09 at 01:13
66

Improved code, using return value of Set#add instead of comparing the size of list and set.

public static <T> boolean hasDuplicate(Iterable<T> all) {
    Set<T> set = new HashSet<T>();
    // Set#add returns false if the set does not change, which
    // means the element was already present, i.e. a duplicate.
    for (T each: all) if (!set.add(each)) return true;
    return false;
}
akuhn
    Would it be more efficient to tell the HashSet how much space to allocate: `Set set = new HashSet(list.size());`? Given a List parameter I think it is more efficient if it is common for the list to not contain duplicates. – Paul Jackson Apr 28 '14 at 22:50
    @PaulJackson Sizing based on the full list will probably be beneficial. However if the common case is for it to find a duplicate early then the space was wasted. Also even sizing the `HashSet` to the size of the list will result in resizing when running through the whole list because of the underlying loading factor of the hash structure. – Jay Anderson Mar 22 '18 at 21:19
    Unless you experience actual issues with runtime or space I would not finetune your code like that. Premature optimization is best avoided. – akuhn Mar 22 '18 at 22:51
22

With Java 8+ you can use Stream API:

boolean areAllDistinct(List<Block> blocksList) {
    return blocksList.stream().map(Block::getNum).distinct().count() == blocksList.size();
}
Sergiy Dakhniy
15

If you are looking to avoid having duplicates at all, then you should just cut out the middle process of detecting duplicates and use a Set.
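A minimal sketch of that approach: keep the elements in a Set from the start, so a duplicate insert is simply rejected. A LinkedHashSet even preserves insertion order, if list-like ordering matters:

```java
import java.util.LinkedHashSet;
import java.util.Set;

public class NoDuplicates {
    // LinkedHashSet keeps insertion order (like a list) while rejecting duplicates.
    private final Set<Integer> values = new LinkedHashSet<>();

    // Returns false instead of adding when the value is already present.
    public boolean addValue(int value) {
        return values.add(value);
    }

    public int size() {
        return values.size();
    }
}
```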

matt b
13

Improved code, to return the duplicate elements:

  • Finds duplicates in a Collection
  • Returns the list of duplicates
  • Unique elements can be obtained from the Set

public static <T> List<T> getDuplicate(Collection<T> list) {

    final List<T> duplicatedObjects = new ArrayList<T>();
    Set<T> set = new HashSet<T>() {
        @Override
        public boolean add(T e) {
            if (contains(e)) {
                duplicatedObjects.add(e);
            }
            return super.add(e);
        }
    };
    for (T t : list) {
        set.add(t);
    }
    return duplicatedObjects;
}


public static <T> boolean hasDuplicate(Collection<T> list) {
    return !getDuplicate(list).isEmpty();
}
Jason Plank
user60062
  • That's pretty awesome. you have some invalid code, and maybe it's not the most optimal way, but your approach totally rocks! (and it works great) – Jules Colle Oct 03 '12 at 19:23
11

I needed to do a similar operation for a Stream, but couldn't find a good example. Here's what I came up with.

public static <T> boolean areUnique(final Stream<T> stream) {
    final Set<T> seen = new HashSet<>();
    return stream.allMatch(seen::add);
}

This has the advantage of short-circuiting when duplicates are found early rather than having to process the whole stream and isn't much more complicated than just putting everything in a Set and checking the size. So this case would roughly be:

List<T> list = ...
boolean allDistinct = areUnique(list.stream());
Jay Anderson
8

If your elements are somehow Comparable (the fact that the order has any real meaning is indifferent -- it just needs to be consistent with your definition of equality), the fastest duplicate removal solution is going to sort the list (O(n log(n))) and then do a single pass looking for repeated elements (that is, equal elements that follow each other) (this is O(n)).

The overall complexity is going to be O(n log(n)), which is roughly the same as what you would get with a Set (n times log(n)), but with a much smaller constant. This is because the constant in sort/dedup results from the cost of comparing elements, whereas the cost from the set is most likely to result from a hash computation, plus one (possibly several) hash comparisons. If you are using a hash-based Set implementation, that is, because a tree-based one is going to give you O(n log²(n)), which is even worse.

As I understand it, however, you do not need to remove duplicates, but merely test for their existence. So you should hand-code a merge or heap sort algorithm on your array, that simply exits returning true (i.e. "there is a dup") if your comparator returns 0, and otherwise completes the sort, and traverse the sorted array testing for repeats. In a merge or heap sort, indeed, when the sort is completed, you will have compared every duplicate pair unless both elements were already in their final positions (which is unlikely). Thus, a tweaked sort algorithm should yield a huge performance improvement (I would have to prove that, but I guess the tweaked algorithm should be in the O(log(n)) on uniformly random data)
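A plain (untweaked) version of the sort-based idea could look like the sketch below: sort a copy, then a single linear pass over adjacent elements finds any duplicate. It is O(n log(n)) overall and leaves the caller's list untouched:

```java
import java.util.ArrayList;
import java.util.Collections;
import java.util.List;

public class SortedDuplicateCheck {
    // After sorting, equal elements are adjacent, so one linear scan suffices.
    public static <T extends Comparable<T>> boolean hasDuplicate(List<T> list) {
        List<T> copy = new ArrayList<>(list); // leave the caller's list untouched
        Collections.sort(copy);
        for (int i = 1; i < copy.size(); i++) {
            if (copy.get(i).compareTo(copy.get(i - 1)) == 0) {
                return true; // adjacent equal elements: duplicate found
            }
        }
        return false;
    }
}
```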

Varkhan
  • In this case, n is 6 so I wouldn't waste a lot of time on implementation details, but I'll keep your idea of the special heap sort if I ever need to do something like that. – Paul Tomblin Feb 19 '09 at 01:25
  • I don't understand the third paragraph. Mergesort and heapsort are both O(nlog(n)), not O(log(n)) as you write; even if you exit once you identify a duplicate, that still doesn't change your time complexity... – ChaimKut May 06 '12 at 14:33
    Commenting just to remove confusing info from the internet. The time complexity of the accepted answer is actually O(n) when using insertion to HashSet. The memory complexity is doubled, but still O(n)... you do not need to sort anything unless you have a requirement of minimalistic memory impact... – Petr Dvořák May 28 '21 at 13:54
2

If you want the set of duplicate values:

import java.util.ArrayList;
import java.util.HashSet;
import java.util.List;
import java.util.Set;

public class FindDuplicateInArrayList {

    public static void main(String[] args) {

        Set<String> uniqueSet = new HashSet<String>();
        List<String> dupesList = new ArrayList<String>();
        for (String a : args) {
            if (uniqueSet.contains(a))
                dupesList.add(a);
            else
                uniqueSet.add(a);
        }
        System.out.println(uniqueSet.size() + " distinct words: " + uniqueSet);
        System.out.println(dupesList.size() + " dupesList words: " + dupesList);
    }
}

You might also want to think about trimming values or using lowercase, depending on your case.
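For example, normalizing each word before the membership check (whether this is appropriate depends entirely on your data):

```java
import java.util.HashSet;
import java.util.Locale;
import java.util.Set;

public class NormalizedDuplicateCheck {
    // Treats "Foo", " foo " and "FOO" as the same word
    // by trimming and lower-casing before the Set lookup.
    public static boolean hasDuplicateIgnoringCase(String[] words) {
        Set<String> seen = new HashSet<>();
        for (String w : words) {
            if (!seen.add(w.trim().toLowerCase(Locale.ROOT))) {
                return true;
            }
        }
        return false;
    }
}
```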

Christophe Roussy
Saurabh
1

The best way to handle this issue is to use a HashSet:

ArrayList<String> listGroupCode = new ArrayList<>();
listGroupCode.add("A");
listGroupCode.add("A");
listGroupCode.add("B");
listGroupCode.add("C");
HashSet<String> set = new HashSet<>(listGroupCode);
ArrayList<String> result = new ArrayList<>(set);

Just print the result ArrayList to see it without duplicates :)

Roy Wasse
Ashana.Jackol
1

Simply put: 1) make sure all items are comparable 2) sort the array 3) iterate over the array and find duplicates

Antonio
  • That's too expensive considering algorithmic complexity. Best sorting is O(n*log(n)) while you can find duplicates with O(n) complexity. – Sergiy Dakhniy Aug 27 '20 at 21:04
1

This answer is written in Kotlin, but can easily be translated to Java.

If your arraylist's size is within a fixed small range, then this is a great solution.

var duplicateDetected = false
if (arrList.size > 1) {
    for (i in 0 until arrList.size) {
        for (j in 0 until arrList.size) {
            if (i != j && arrList.get(i) == arrList.get(j)) {
                duplicateDetected = true
            }
        }
    }
}
grantespo
1
private boolean isDuplicate() {
    for (int i = 0; i < arrayList.size(); i++) {
        for (int j = i + 1; j < arrayList.size(); j++) {
            if (arrayList.get(i).getName().trim().equalsIgnoreCase(arrayList.get(j).getName().trim())) {
                return true;
            }
        }
    }

    return false;
}
Ketan Ramani
1

To know the duplicates in a List, use the following code. It will give you the set which contains the duplicates.

public Set<?> findDuplicatesInList(List<?> beanList) {
    System.out.println("findDuplicatesInList::" + beanList);
    Set<Object> duplicateRowSet = new LinkedHashSet<Object>();
    for (int i = 0; i < beanList.size(); i++) {
        Object superString = beanList.get(i);
        System.out.println("findDuplicatesInList::superString::" + superString);
        for (int j = 0; j < beanList.size(); j++) {
            if (i != j) {
                Object subString = beanList.get(j);
                System.out.println("findDuplicatesInList::subString::" + subString);
                if (superString.equals(subString)) {
                    duplicateRowSet.add(beanList.get(j));
                }
            }
        }
    }
    System.out.println("findDuplicatesInList::duplicationSet::" + duplicateRowSet);
    return duplicateRowSet;
}
Rakesh Sabbani
0
    String tempVal = null;
    for (int i = 0; i < l.size(); i++) {
        tempVal = l.get(i); //take the ith object out of list
        while (l.contains(tempVal)) {
            l.remove(tempVal); //remove all matching entries
        }
        l.add(tempVal); //at last add one entry
    }

Note: this will have a major performance hit, though, as items are removed from the start of the list. To address this, we have two options: 1) iterate in reverse order and remove elements, or 2) use a LinkedList instead of an ArrayList. The above example is an answer to the biased interview question of removing duplicates from a List without using any other collection. In the real world, if I had to achieve this, I would put the elements from the List into a Set, simple!
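The reverse-iteration variant from option 1 could look like this sketch. Iterating from the end keeps earlier indices valid while removing, and it still uses no auxiliary collection (at the cost of O(n²) from indexOf):

```java
import java.util.List;

public class InPlaceDedup {
    // Removes later occurrences of each element, keeping the first,
    // without using any auxiliary collection.
    public static <T> void dedupe(List<T> list) {
        for (int i = list.size() - 1; i >= 0; i--) {
            // If the first occurrence of this element is at an earlier index,
            // the element at i is a duplicate. Removing at i is safe because
            // we only ever remove at or after the current position.
            if (list.indexOf(list.get(i)) != i) {
                list.remove(i);
            }
        }
    }
}
```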

0
/**
 * Method to detect presence of duplicates in a generic list.
 * Depends on the equals method of the concrete type; make sure to override it as required.
 */
public static <T> boolean hasDuplicates(List<T> list) {
    int count = list.size();
    T t1, t2;

    for (int i = 0; i < count; i++) {
        t1 = list.get(i);
        for (int j = i + 1; j < count; j++) {
            t2 = list.get(j);
            if (t2.equals(t1)) {
                return true;
            }
        }
    }
    return false;
}

An example of a concrete class that has overridden equals() :

public class Reminder{
    private long id;
    private int hour;
    private int minute;

    public Reminder(long id, int hour, int minute){
        this.id = id;
        this.hour = hour;
        this.minute = minute;
    }

    @Override
    public boolean equals(Object other){
        if(other == null) return false;
        if(this.getClass() != other.getClass()) return false;
        Reminder otherReminder = (Reminder) other;
        if(this.hour != otherReminder.hour) return false;
        if(this.minute != otherReminder.minute) return false;

        return true;
    }
}
faizal
0
    ArrayList<String> withDuplicates = new ArrayList<>();
    withDuplicates.add("1");
    withDuplicates.add("2");
    withDuplicates.add("1");
    withDuplicates.add("3");

    ArrayList<String> withoutDuplicates = new ArrayList<>();
    ArrayList<String> duplicates = new ArrayList<>();

    // Track the words already seen; Set.add returns false for a repeat.
    HashSet<String> seen = new HashSet<>();
    for (String word : withDuplicates) {
        if (seen.add(word)) {
            withoutDuplicates.add(word);
        } else {
            duplicates.add(word);
        }
    }
    System.out.println(duplicates);
    System.out.println(withoutDuplicates);
Venkata
-1

A simple solution for learners. A method to find the duplicates:

public static List<Integer> findDuplicate(List<Integer> numList) {
    List<Integer> dupLst = new ArrayList<Integer>();
    // Compare one number against all the other numbers except itself.
    for (int i = 0; i < numList.size(); i++) {
        for (int j = 0; j < numList.size(); j++) {
            // Use equals(), not ==: Integer objects are compared by reference with ==.
            if (i != j && numList.get(i).equals(numList.get(j))) {
                boolean isNumExist = false;
                // The loop below avoids repeating the duplicate in the result list.
                for (Integer aNum : dupLst) {
                    if (aNum.equals(numList.get(i))) {
                        isNumExist = true;
                        break;
                    }
                }
                if (!isNumExist) {
                    dupLst.add(numList.get(i));
                }
            }
        }
    }
    return dupLst;
}
Bibin Zacharias
  • Basically a duplicate of [this answer](https://stackoverflow.com/a/9133979/1553851) with the bonus inefficiency of using a result list and checking for duplicates by hand, plus incorrect object comparison. – shmosel Feb 21 '23 at 21:11