3

I'm using a LinkedHashSet to implement a set ordered by entry. If a duplicate element is added, it should go at the end and the entry already in the set should be removed.

I'm debating whether it's worth it to check set.contains(entry) before I call set.remove(entry)? Essentially does checking contains() each time offer any time/space benefit over just blindly calling remove() each time?

Set<Integer> set = new LinkedHashSet<>();

// The below
set.remove(entry);
set.add(entry);

// Versus the below
if (set.contains(entry)) {
  set.remove(entry);
}
set.add(entry);
M. Justin
  • 14,487
  • 7
  • 91
  • 130
Fred
  • 165
  • 2
  • 7
  • 1
    No its not worth to check, brings in check-then-act problems alongwith it – Nitin Dandriyal May 27 '15 at 02:58
  • @NitinDandriyal it depends if the `Set` may be reused in several threads or not. – Luiggi Mendoza May 27 '15 at 02:59
  • 1
    @LuiggiMendoza Changing comment to - No its not worth to check as its redundant, also brings in check-then-act problems along with it in case of when multiple threads access the set. – Nitin Dandriyal May 27 '15 at 03:06
  • For concurrency, you can always use `Collections.synchronizedSet(set);` to get a thread safe set from an existing one. – augray May 27 '15 at 03:19
  • @LuiggiMendoza It is never worthwhile to check, and, if there are multiple threads, checking will make it *worse,* by introducing timing-window errors. `Set.remove()` already has to check, obviously, and it even returns a `boolean` telling you whether it was there or not. – user207421 Aug 13 '23 at 04:21
  • I would *guess* that: if `contains` is called, the element must be searched once; when then `remove` is called, the element must be searched a second time; and when `add` is called, it must be searched a third time. (sure the `Hash` part *helps* a bit - no full scan needed) – user16320675 Aug 13 '23 at 06:58

3 Answers3

1

I would recommend against calling both contains and remove. Both calls have to search through the chains to find the element, and contains does this no faster than remove. So you are potentially doubling your required time for a call each time.

You can browse through the code for contains and remove yourself.

You'll find that both end up possibly iterating through the hash chains at some point (remove in HashMap's removeNode and contains in HashMap's getNode), which could be a performance hit if your set is under heavy load.

If your set isn't under heavy load, then it still probably isn't worth it, but doesn't matter so much since in a sparse hash chains set the expected lookup/remove is O(1) (i.e. will be just as fast regardless of whether you have 5 elements or 5000 in it).

augray
  • 3,043
  • 1
  • 17
  • 30
1

LinkedHashSet uses HashMap behind the scenes remove(Object) will remove the Object only if it contains the Object hence its unnecessary and will increase time complexity in case Object exists in set.

Nitin Dandriyal
  • 1,559
  • 11
  • 20
  • `LinkedHashSet` is not thread safe so should not be used for multithreaded access if you're modifying it in the first place. – Raniz May 27 '15 at 03:15
  • @Raniz That doesn't mean that you can not use `LinkedHashSet` while multiple threads are supposed to use it – Nitin Dandriyal May 27 '15 at 03:16
  • Only if you *never* modify it. Calling `add` from multiple threads simultaneously is bound to mess up the internal state of the underlying `HashMap` eventually. – Raniz May 27 '15 at 03:19
  • exactly that's why synchronization techniques are useful, even for modifying. Have you ever built a simple cache using `LinkeedHashSet` if yes you'll understand – Nitin Dandriyal May 27 '15 at 03:25
  • But if you're already synchronising on your object you won't have any "check-then-act" issues - unless you do them in separate synchronised blocks of course which would just be outright stupid. – Raniz May 27 '15 at 03:33
  • @Raniz If by synchronizing on object you mean synchronizing on `this`, its almost never a good idea, exposes to client side locking, and how are synchronized blocks stupid(how do you even make such perceptions)? – Nitin Dandriyal May 27 '15 at 03:37
  • I did not advocate any synchronisation method at all, especially not on `this` - you're the one who brought up synchronisation. I also said that *separate synchronised blocks* - as in: gain exclusive access; check; leave exclusive access; gain exclusive access again; modify; leave exclusive access; - were stupid. Read my comment again. – Raniz May 27 '15 at 03:47
  • 1
    Vague discussion without any direction, let's close this here. Editing my answer to remove multi-threaded part, anyways doesn't contribute to answer much. – Nitin Dandriyal May 27 '15 at 03:52
0

Java 21 is adding an addLast method to LinkedHashSet, which will move the element to the end of the set if it is not already present.

Adds an element as the last element of this collection (optional operation). After this operation completes normally, the given element will be a member of this collection, and it will be the last element in encounter order.

Using this method instead of add solves the stated requirement without having to also call remove.

Set<Integer> set = new LinkedHashSet<>();

set.addLast(entry);
M. Justin
  • 14,487
  • 7
  • 91
  • 130