115

I need to combine two string sets while filtering out redundant information. This is the solution I came up with; is there a better way that anyone can suggest? Perhaps something built in that I overlooked? I didn't have any luck with Google.

Set<String> oldStringSet = getOldStringSet();
Set<String> newStringSet = getNewStringSet();

for(String currentString : oldStringSet)
{
    if (!newStringSet.contains(currentString))
    {
        newStringSet.add(currentString);
    }
}
Phonolog
FooBar

12 Answers

138

Since a Set does not contain duplicate entries, you can combine the two with:

newStringSet.addAll(oldStringSet);

It does not matter if you add an element twice; the set will contain it only once, so there is no need to check with the `contains` method first.
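As a minimal sketch of the point above, the following shows `addAll` merging two overlapping sets. The class name `AddAllDemo`, the helper `setOf`, and the sample values are assumptions made for illustration; only `Set.addAll` itself comes from the answer.

```java
import java.util.Arrays;
import java.util.HashSet;
import java.util.Set;

class AddAllDemo {
    // Hypothetical helper: builds a small mutable set from sample values.
    static Set<String> setOf(String... values) {
        return new HashSet<>(Arrays.asList(values));
    }

    public static void main(String[] args) {
        Set<String> oldStringSet = setOf("a", "b", "c");
        Set<String> newStringSet = setOf("b", "c", "d");

        // addAll returns true if the receiving set changed as a result of the call.
        boolean changed = newStringSet.addAll(oldStringSet);

        System.out.println(changed);             // true, because "a" was new
        System.out.println(newStringSet.size()); // 4 (a, b, c, d)
    }
}
```

Note that `newStringSet` is modified in place; `oldStringSet` is left untouched.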

icedtrees
dacwe
134

You can do it using this one-liner

Set<String> combined = Stream.concat(newStringSet.stream(), oldStringSet.stream())
        .collect(Collectors.toSet());

With a static import it looks even nicer

Set<String> combined = concat(newStringSet.stream(), oldStringSet.stream())
        .collect(toSet());

Another way is to use the flatMap method:

Set<String> combined = Stream.of(newStringSet, oldStringSet).flatMap(Set::stream)
        .collect(toSet());

Any collection can also easily be combined with a single element:

Set<String> combined = concat(newStringSet.stream(), Stream.of(singleValue))
        .collect(toSet());
ytterrr
  • 1
    how is this better than addAll? – KKlalala Feb 01 '18 at 22:11
  • 15
    @KKlalala, your requirements will determine which is better. The main difference between `addAll` and using Streams is: • using `set1.addAll(set2)` will have the side effect of physically changing the contents of `set1`. • However, using Streams will always result in a new instance of `Set` containing the contents of both sets without modifying either of the original Set instances. IMHO this answer is better because it avoids side-effects and potential for unexpected changes to the original set if it were to be used elsewhere whilst expecting the original contents. HTH – edwardsmatt Jun 26 '18 at 02:53
  • 3
    This also has the advantage of supporting Immutable Sets. See: https://docs.oracle.com/javase/8/docs/api/java/util/Collections.html#unmodifiableSet-java.util.Set- – edwardsmatt Jun 26 '18 at 03:12
  • A word of advice: if you collect to something other than `Set`, add `.distinct()`, because the stream can contain duplicates – Paweł Prażak Mar 30 '22 at 17:08
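The points raised in the comments (no side effects, support for unmodifiable sets, and the need for `distinct()` when collecting to a list) can be sketched as follows. The class name `StreamUnionDemo` and the helper methods `union` and `unionAsList` are hypothetical names chosen for this example, and the sample data is made up.

```java
import java.util.Arrays;
import java.util.Collections;
import java.util.HashSet;
import java.util.List;
import java.util.Set;
import java.util.stream.Collectors;
import java.util.stream.Stream;

class StreamUnionDemo {
    // Returns a new set with the elements of both inputs; neither input is modified.
    static Set<String> union(Set<String> first, Set<String> second) {
        return Stream.concat(first.stream(), second.stream())
                .collect(Collectors.toSet());
    }

    // Collecting to a List would keep duplicates, so distinct() is applied first.
    static List<String> unionAsList(Set<String> first, Set<String> second) {
        return Stream.concat(first.stream(), second.stream())
                .distinct()
                .collect(Collectors.toList());
    }

    public static void main(String[] args) {
        // Unmodifiable inputs: calling addAll on either of these would throw.
        Set<String> oldStringSet =
                Collections.unmodifiableSet(new HashSet<>(Arrays.asList("a", "b")));
        Set<String> newStringSet =
                Collections.unmodifiableSet(new HashSet<>(Arrays.asList("b", "c")));

        Set<String> combined = union(newStringSet, oldStringSet);
        System.out.println(combined.size());     // 3
        System.out.println(newStringSet.size()); // 2, unchanged
        System.out.println(unionAsList(newStringSet, oldStringSet).size()); // 3
    }
}
```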
46

The same with Guava:

Set<String> combinedSet = Sets.union(oldStringSet, newStringSet);
Pimgd
Alexander Pranko
14

By definition, a Set contains only unique elements.

Set<String> distinct = new HashSet<String>();
distinct.addAll(oldStringSet);
distinct.addAll(newStringSet);

To enhance your code, you may create a generic method for that:

public static <T> Set<T> distinct(Collection<T>... lists) {
    Set<T> distinct = new HashSet<T>();

    for(Collection<T> list : lists) {
        distinct.addAll(list);
    }
    return distinct;
}
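For illustration, here is one way such a generic method might be called with a mix of collection types. This is a sketch rather than the answer's exact code: the class name `DistinctDemo` and the sample data are assumptions, the parameter type is widened to `Collection<? extends T>`, and `@SafeVarargs` is added only to silence the generic-varargs warning.

```java
import java.util.Arrays;
import java.util.Collection;
import java.util.HashSet;
import java.util.List;
import java.util.Set;

class DistinctDemo {
    // Collects the elements of every given collection into one new HashSet.
    @SafeVarargs
    static <T> Set<T> distinct(Collection<? extends T>... collections) {
        Set<T> result = new HashSet<>();
        for (Collection<? extends T> collection : collections) {
            result.addAll(collection);
        }
        return result;
    }

    public static void main(String[] args) {
        Set<String> oldStringSet = new HashSet<>(Arrays.asList("a", "b"));
        List<String> extras = Arrays.asList("b", "c", "c");

        // A set and a list can be merged in one call; duplicates collapse.
        Set<String> merged = distinct(oldStringSet, extras);
        System.out.println(merged.size()); // 3 (a, b, c)
    }
}
```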
10

If you are using Guava you can also use a builder to get more flexibility:

ImmutableSet.<String>builder().addAll(someSet)
                              .addAll(anotherSet)
                              .add("A single string")
                              .build();
7

If you are using Apache Commons Collections, use the SetUtils class (org.apache.commons.collections4.SetUtils):

SetUtils.union(setA, setB);
Vinit Solanki
4

Just use newStringSet.addAll(oldStringSet). No need to check for duplicates as the Set implementation does this already.

tobiasbayer
3

If you care about performance, you don't need to keep your two original sets, and one of them can be huge, I would suggest checking which set is larger and adding the elements of the smaller one to it.

Set<String> newStringSet = getNewStringSet();
Set<String> oldStringSet = getOldStringSet();

Set<String> myResult;
if(oldStringSet.size() > newStringSet.size()){
    oldStringSet.addAll(newStringSet);
    myResult = oldStringSet;
} else{
    newStringSet.addAll(oldStringSet);
    myResult = newStringSet;
}

In this way, if your new set has 10 elements and your old set has 100 000, you only do 10 operations instead of 100 000.
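The same idea could be wrapped in a small reusable helper. This is only a sketch: the class name `SizeAwareUnion` and the method name `unionInPlace` are hypothetical, and the helper deliberately mutates one of its arguments, so it fits only when neither original set has to be preserved.

```java
import java.util.Arrays;
import java.util.HashSet;
import java.util.Set;

class SizeAwareUnion {
    // Adds the smaller set into the larger one and returns the larger one.
    // The returned set is one of the two arguments, modified in place.
    static <T> Set<T> unionInPlace(Set<T> first, Set<T> second) {
        Set<T> larger = first.size() >= second.size() ? first : second;
        Set<T> smaller = (larger == first) ? second : first;
        larger.addAll(smaller);
        return larger;
    }

    public static void main(String[] args) {
        Set<String> oldStringSet = new HashSet<>(Arrays.asList("a", "b", "c", "d", "e"));
        Set<String> newStringSet = new HashSet<>(Arrays.asList("e", "f"));

        Set<String> result = unionInPlace(newStringSet, oldStringSet);
        System.out.println(result == oldStringSet); // true, the larger set was reused
        System.out.println(result.size());          // 6
    }
}
```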

Ricola
  • This is very good logic; I can't imagine why it isn't available as a parameter on the main addAll method, like `public boolean addAll(int index, Collection<? extends E> c, boolean checkSizes)` – Gaspar Apr 26 '19 at 14:46
  • I guess it's because of the specification itself: *Adds all of the elements in the specified collection to this collection*. You could indeed have another method, but it would be quite confusing if it didn't follow the same specification as the methods it overloads. – Ricola Apr 26 '19 at 22:23
  • Yes, I meant another method overloading that one – Gaspar Apr 29 '19 at 11:22
3

http://docs.oracle.com/javase/7/docs/api/java/util/Set.html#addAll(java.util.Collection)

Since sets can't have duplicates, just adding all the elements of one to the other generates the correct union of the two.

Viruzzo
3
 newStringSet.addAll(oldStringSet);

This will produce the union of newStringSet and oldStringSet, stored in newStringSet.

Kushan
2
Set.addAll()

Adds all of the elements in the specified collection to this set if they're not already present (optional operation). If the specified collection is also a set, the addAll operation effectively modifies this set so that its value is the union of the two sets.

newStringSet.addAll(oldStringSet);
HK boy
UmNyobe
1

You can use a Java 8 stream to get a new set:

Stream.of(set1, set2)
        .flatMap(Set::stream)
        .collect(Collectors.toSet());
RazvanParautiu