100

I know about SortedSet, but in my case I need something that implements List, and not Set. So is there an implementation out there, in the API or elsewhere?

It shouldn't be hard to implement myself, but I figured why not ask people here first?

naXa stands with Ukraine
  • 35,493
  • 19
  • 190
  • 259
Yuval
  • 7,987
  • 12
  • 40
  • 54
  • 1
    Why does it need to implement List? Sets are iterable, like lists, so I suppose the receiving method is enforcing List for some other reason. – Rob Nov 06 '08 at 14:04
  • @Rob That's right, it's an external demand, and the data structure includes a hell of a lot more than one List. – Yuval Nov 06 '08 at 14:13
  • If the user wants a LIST, then it's clear that needs methods of the LIST interface that are not present um the SET interface... – marcolopes Mar 04 '16 at 04:15

12 Answers12

104

There's no Java collection in the standard library to do this. LinkedHashSet<E> preserves ordering similarly to a List, though, so if you wrap your set in a List when you want to use it as a List you'll get the semantics you want.

Alternatively, the Commons Collections (or commons-collections4, for the generic version) has a List which does what you want already: SetUniqueList / SetUniqueList<E>.

naXa stands with Ukraine
  • 35,493
  • 19
  • 190
  • 259
Calum
  • 5,846
  • 2
  • 26
  • 22
  • 6
    The Commons class is exactly what I need, but my boss told me to implement it myself eventually. 10x anyway! – Yuval Nov 06 '08 at 14:19
  • 5
    Ah well, nothing like reinventing the wheel! You'll know now if the need comes up again, anyway. collections15 is a pretty useful thing to have kicking around; MultiMaps in particular ease the pain of something one ends up implementing oneself a lot. – Calum Nov 06 '08 at 17:40
  • 20
    @skaffman: he's not actually an idiot, but sometimes he makes moves that are... well, odd. Anyway, I'm not gonna introduce bugs into the product. In today's market, I'm happy with my job and not looking to slam doors and burn bridges, if you get my point. – Yuval Jul 22 '09 at 05:34
  • 3
    I'm quite surprise when SetUniqueList doesn't have parameterized type. – emeraldhieu Jun 18 '12 at 04:18
  • 2
    wrapping LinkedHashSet in a list doesn't make it easy or efficient to implement all of the index-based operations on a list. i don't think that's a reasonable approach. and also, there are reason not to pull in the commons things, for example, on a mobile platform where size matters. – Jeffrey Blattman Feb 07 '13 at 19:06
  • 3
    Jeffrey: On mobile platforms the system will usually remove unused classes, but sure, there's plenty of reasons you might not go down one of these "normal" solutions. There's always some trade-off to be made, and no solution will fix all cases. – Calum Feb 11 '13 at 15:10
  • Just keep in mind, sorting will NOT work on that "list" – miroslavign Jun 13 '18 at 15:12
20

Here is what I did and it works.

Assuming I have an ArrayList to work with the first thing I did was created a new LinkedHashSet.

LinkedHashSet<E> hashSet = new LinkedHashSet<E>()

Then I attempt to add my new element to the LinkedHashSet. The add method does not alter the LinkedHasSet and returns false if the new element is a duplicate. So this becomes a condition I can test before adding to the ArrayList.

if (hashSet.add(E)) arrayList.add(E);

This is a simple and elegant way to prevent duplicates from being added to an array list. If you want you can encapsulate it in and override of the add method in a class that extends the ArrayList. Just remember to deal with addAll by looping through the elements and calling the add method.

user2308001
  • 202
  • 1
  • 2
  • 8
user3570018
  • 201
  • 2
  • 2
  • 2
    Yeah, I think, this is the best solution for it, you can also simply use a normal HashSet, not a Linked, and then you can use your list as you want, you can also deside what to do in some situations, like in adding an element inside a list before a specific index, you can deside that you want to move the duplicated item to this position or not. – gyurix Jan 25 '15 at 14:37
  • Best solution here... Will post my UniqueList class code – marcolopes Mar 04 '16 at 05:28
  • This worked for me, in my BFS Graph algorithm. Because I had some nodes that I added to a Queue (LinkedList) just if they weren't already in. – Jeancarlo Fontalvo Aug 21 '16 at 03:07
12

So here's what I did eventually. I hope this helps someone else.

class NoDuplicatesList<E> extends LinkedList<E> {
    @Override
    public boolean add(E e) {
        if (this.contains(e)) {
            return false;
        }
        else {
            return super.add(e);
        }
    }

    @Override
    public boolean addAll(Collection<? extends E> collection) {
        Collection<E> copy = new LinkedList<E>(collection);
        copy.removeAll(this);
        return super.addAll(copy);
    }

    @Override
    public boolean addAll(int index, Collection<? extends E> collection) {
        Collection<E> copy = new LinkedList<E>(collection);
        copy.removeAll(this);
        return super.addAll(index, copy);
    }

    @Override
    public void add(int index, E element) {
        if (this.contains(element)) {
            return;
        }
        else {
            super.add(index, element);
        }
    }
}   
Yuval
  • 7,987
  • 12
  • 40
  • 54
  • 10
    Be careful - LinkedList.contains() needs to scan the entire list to determine if an object is contained in the List. This means that when you are adding objects to a large List, the entire List is scanned for each add operation (in the worst case). This can end up being SLOW. – matt b Nov 06 '08 at 16:21
  • 8
    Also, your addAll override doesn't check for duplicates in the collection being passed to addAll(). – matt b Nov 06 '08 at 16:23
  • @mattb How would you solve this problem then: On Android, when binding objects to a list item view, we are given the position of the item in the view adapter. Since sets have no index, the only way is to check whether or not the object exists when using lists is to iterate through and look for an existing copy. – TheRealChx101 Apr 05 '19 at 06:32
  • the performance-problems of this solution could be fixed with a simple, additional `Set` which stores the hashCodes of the elements _(instead of searching through the entire list)_ - which would require all of the elements to properly implement hashCode(), of course but with helper frameworks like Lombok, this really is no problem ... its kind of trivial, actually. One could even optimize *that* solution with a red/black-tree for hashCodes ... small memory overhead for large performance gains; welcome to the world of cloud computing ;-) – specializt Jun 08 '21 at 07:37
6

Why not encapsulate a set with a list, sort like:

new ArrayList( new LinkedHashSet() )

This leaves the other implementation for someone who is a real master of Collections ;-)

Daniel Hiller
  • 3,415
  • 3
  • 23
  • 33
  • 6
    This constructor copies the contents of the Set into the new List, rather than wrapping it. – Calum Nov 06 '08 at 13:35
  • @Calum, that is correct, but instead of worrying about not adding duplicates to a List, he can add his objects to a Set (and let the Set worry about filtering out duplicates) and just wrap that Set in a List when passing it to the external method. – matt b Nov 06 '08 at 16:26
  • 5
    This copies a set to a list but you don't have any well-known ordering. But this what the question is all about. – Janning Vygen May 03 '12 at 12:32
5

You should seriously consider dhiller's answer:

  1. Instead of worrying about adding your objects to a duplicate-less List, add them to a Set (any implementation), which will by nature filter out the duplicates.
  2. When you need to call the method that requires a List, wrap it in a new ArrayList(set) (or a new LinkedList(set), whatever).

I think that the solution you posted with the NoDuplicatesList has some issues, mostly with the contains() method, plus your class does not handle checking for duplicates in the Collection passed to your addAll() method.

matt b
  • 138,234
  • 66
  • 282
  • 345
  • I'd love to learn of these contains() issues. As for the addAll(), I create a copy of the given collection and remove all objects already in 'this'. How does that not handle duplicates? – Yuval Nov 06 '08 at 17:21
  • As I mentioned in my comment to your class posting, contains() has to scan the entire list (in the worst case) to find if the object is contained in the list. If you have a list of 1 million items and add 10 it it individually, then (in the worst case) over ten million items are scanned. – matt b Nov 06 '08 at 18:45
  • As for addAll(), if the Collection passed to addAll contains duplicates itself, they are not detected. For example: your list {A, B, C, D} parameter list {B, D, E, E, E}. You create a copy of the parameter, and after removeAll it contains {E, E, E}. – matt b Nov 06 '08 at 18:46
  • The addAll() issue is not really relevant to me, as I use NoDuplicatesList throughout the procedure, and addAll() should be receiving another NoDuplicatesList as its parameter. What would you suggest to improve the contains() performance? – Yuval Nov 08 '08 at 21:17
3

I needed something like that, so I went to the commons collections and used the SetUniqueList, but when I ran some performance test, I found that it seems not optimized comparing to the case if I want to use a Set and obtain an Array using the Set.toArray() method.

The SetUniqueTest took 20:1 time to fill and then traverse 100,000 Strings comparing to the other implementation, which is a big deal difference.

So, if you worry about the performance, I recommend you to use the Set and Get an Array instead of using the SetUniqueList, unless you really need the logic of the SetUniqueList, then you'll need to check other solutions...

Testing code main method:

public static void main(String[] args) {


SetUniqueList pq = SetUniqueList.decorate(new ArrayList());
Set s = new TreeSet();

long t1 = 0L;
long t2 = 0L;
String t;


t1 = System.nanoTime();
for (int i = 0; i < 200000; i++) {
    pq.add("a" + Math.random());
}
while (!pq.isEmpty()) {
    t = (String) pq.remove(0);
}
t1 = System.nanoTime() - t1;

t2 = System.nanoTime();
for (int i = 0; i < 200000; i++) {
    s.add("a" + Math.random());
}

s.clear();
String[] d = (String[]) s.toArray(new String[0]);
s.clear();
for (int i = 0; i < d.length; i++) {
    t = d[i];

}
t2 = System.nanoTime() - t2;

System.out.println((double)t1/1000/1000/1000); //seconds
System.out.println((double)t2/1000/1000/1000); //seconds
System.out.println(((double) t1) / t2);        //comparing results

}

Regards, Mohammed Sleem

Grandtour
  • 1,127
  • 1
  • 11
  • 28
1

My lastest implementation: https://github.com/marcolopes/dma/blob/master/org.dma.java/src/org/dma/java/util/UniqueArrayList.java

import java.util.ArrayList;
import java.util.Arrays;
import java.util.Collection;
import java.util.LinkedHashSet;

/**
 * Extends <tt>ArrayList</tt> and guarantees no duplicate elements
 */
public class UniqueArrayList<T> extends ArrayList<T> {

    private static final long serialVersionUID = 1L;

    public UniqueArrayList(int initialCapacity) {
        super(initialCapacity);
    }

    public UniqueArrayList() {
        super();
    }

    public UniqueArrayList(T[] array) {
        this(Arrays.asList(array));
    }

    public UniqueArrayList(Collection<? extends T> col) {
        addAll(col);
    }


    @Override
    public void add(int index, T e) {
        if (!contains(e)) super.add(index, e);
    }

    @Override
    public boolean add(T e) {
        return contains(e) ? false : super.add(e);
    }

    @Override
    public boolean addAll(Collection<? extends T> col) {
        Collection set=new LinkedHashSet(this);
        set.addAll(col);
        clear();
        return super.addAll(set);
    }

    @Override
    public boolean addAll(int index, Collection<? extends T> col) {
        Collection set=new LinkedHashSet(subList(0, index));
        set.addAll(col);
        set.addAll(subList(index, size()));
        clear();
        return super.addAll(set);
    }

    @Override
    public T set(int index, T e) {
        return contains(e) ? null : super.set(index, e);
    }

    /** Ensures element.equals(o) */
    @Override
    public int indexOf(Object o) {
        int index=0;
        for(T element: this){
            if (element.equals(o)) return index;
            index++;
        }return -1;
    }


}
marcolopes
  • 9,232
  • 14
  • 54
  • 65
0

The documentation for collection interfaces says:

Set — a collection that cannot contain duplicate elements.
List — an ordered collection (sometimes called a sequence). Lists can contain duplicate elements.

So if you don't want duplicates, you probably shouldn't use a list.

naXa stands with Ukraine
  • 35,493
  • 19
  • 190
  • 259
Hauch
  • 41
  • 1
  • I specifically mentioned that I need a List implementation. Trust me, there's a reason. – Yuval Nov 06 '08 at 14:17
  • Is the reason because you're interacting with an API that is taking a List as a parameter (instead of a Collection)? That's a bit annoying to have to deal with – matt b Nov 06 '08 at 16:17
  • Actually the API takes a Map>>, which means holding somewhere in the vicinity of dozens to hundreds of lists... bah. – Yuval Nov 06 '08 at 17:26
  • Constructing probability functions with element-probability pairs can involve not have duplicates, although duplicate elements could just be merged. – Al G Johnston Apr 18 '20 at 13:15
0

Off the top of my head, lists allow duplicates. You could quickly implement a UniqueArrayList and override all the add / insert functions to check for contains() before you call the inherited methods. For personal use, you could only implement the add method you use, and override the others to throw an exception in case future programmers try to use the list in a different manner.

naXa stands with Ukraine
  • 35,493
  • 19
  • 190
  • 259
Kieveli
  • 10,944
  • 6
  • 56
  • 81
  • I was ready to fall back to this idea (which eventually I had to) if no one suggested anything better =8-) See my own answer above. – Yuval Nov 06 '08 at 17:23
-1

What about this? Just check the list before adding with a contains for an already existing object

while (searchResult != null && searchResult.hasMore()) {
    SearchResult nextElement = searchResult.nextElement();
    Attributes attributes = nextElement.getAttributes();

    String stringName = getAttributeStringValue(attributes, SearchAttribute.*attributeName*);
   
   if(!List.contains(stringName)){
    List.add(stringName);
   }
}
David Buck
  • 3,752
  • 35
  • 31
  • 35
GerNi
  • 1
-1

in add method, why not using HashSet.add() to check duplicates instead of HashSet.consist(). HashSet.add() will return true if no duplicate and false otherwise.

naXa stands with Ukraine
  • 35,493
  • 19
  • 190
  • 259
-3

I just made my own UniqueList in my own little library like this:

package com.bprog.collections;//my own little set of useful utilities and classes

import java.util.HashSet;
import java.util.ArrayList;
import java.util.List;
/**
*
* @author Jonathan
*/
public class UniqueList {

private HashSet masterSet = new HashSet();
private ArrayList growableUniques;
private Object[] returnable;

public UniqueList() {
    growableUniques = new ArrayList();
}

public UniqueList(int size) {
    growableUniques = new ArrayList(size);
}

public void add(Object thing) {
    if (!masterSet.contains(thing)) {
        masterSet.add(thing);
        growableUniques.add(thing);
    }
}

/**
 * Casts to an ArrayList of unique values
 * @return 
 */
public List getList(){
    return growableUniques;
}

public Object get(int index) {
    return growableUniques.get(index);
}

public Object[] toObjectArray() {
    int size = growableUniques.size();
    returnable = new Object[size];
    for (int i = 0; i < size; i++) {
        returnable[i] = growableUniques.get(i);
    }
    return returnable;
    }
}

I have a TestCollections class that looks like this:

package com.bprog.collections;
import com.bprog.out.Out;
/**
*
* @author Jonathan
*/
public class TestCollections {
    public static void main(String[] args){
        UniqueList ul = new UniqueList();
        ul.add("Test");
        ul.add("Test");
        ul.add("Not a copy");
        ul.add("Test"); 
        //should only contain two things
        Object[] content = ul.toObjectArray();
        Out.pl("Array Content",content);
    }
}

Works fine. All it does is it adds to a set if it does not have it already and there's an Arraylist that is returnable, as well as an object array.