14

More specifically: how to get the nth element of a LinkedHashSet (which has a predictable iteration order)? I want to retrieve the nth element inserted into this Set (which wasn't already present).

Is it better to use a List:

List<T> list = new ArrayList<T>(mySet);
T value = list.get(x); // x < mySet.size()

or the toArray(T [] a) method:

T[] array = mySet.toArray(new T[mySet.size()]); // T stands for a concrete type here; "new T[...]" does not compile for a type parameter
T value = array[y]; // y < mySet.size()

Other than the (likely slight) performance differences, anything to watch out for? Any clear winner?

Edit 1

NB: It doesn't matter why I want the last-inserted element, all that matters is that I want it. LinkedHashSet was specifically chosen because it "defines the iteration ordering, which is the order in which elements were inserted into the set (insertion-order). Note that insertion order is not affected if an element is re-inserted into the set."

Edit 2

This question seems to have devolved into a discussion of whether any Set implementation can ever preserve original insertion order. So I put up some simple test code at http://pastebin.com/KZJ3ETx9 to show that yes, LinkedHashSet does indeed preserve insertion order (the same as its iteration order) as its Javadoc claims.
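
For readers who can't open the link, here is a minimal sketch of that kind of test. It is not the pastebin code itself; the class name `InsertionOrderDemo`, the helper `iterationOrder`, and the sample values are made up for illustration.

```java
import java.util.ArrayList;
import java.util.LinkedHashSet;
import java.util.List;
import java.util.Set;

class InsertionOrderDemo {

    // Adds the values in the given order and returns the resulting iteration order.
    static List<Integer> iterationOrder(int... values) {
        Set<Integer> set = new LinkedHashSet<>();
        for (int v : values) {
            set.add(v); // returns false for a duplicate and leaves the set unchanged
        }
        return new ArrayList<>(set);
    }

    public static void main(String[] args) {
        // 3 is added three times; only the first add affects the order.
        List<Integer> order = iterationOrder(1, 2, 3, 5, 3, 8, 3);
        System.out.println(order);        // [1, 2, 3, 5, 8]
        System.out.println(order.get(2)); // 3
    }
}
```

Re-adding 3 does not move it to the end, which is the behaviour the quoted Javadoc sentence describes.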

Edit 3

Modified the description of the problem so that everybody isn't too focused on retrieving the last element of the Set (I originally thought that the title of the question would be enough of a hint — obviously I was wrong).

markvgti
  • 1
    Why did you use a `Set` in the first place if insertion order is important? And what is the meaning of "last element inserted" if the element was already present in the set? – Henry Jun 28 '14 at 06:38
  • Obviously using a `Set` to ensure no duplicates. If I insert 3 elements into a `Set`, all unique, then the 3rd element is the "last element inserted". Think of it as the "freshest" `Set` member: I want to use the "freshest" value. – markvgti Jun 28 '14 at 06:40
  • 1
    @markvgti there is no such thing as the "freshest" set... A set has no order! – Nir Alfasi Jun 28 '14 at 06:42
  • @alfasin It doesn't matter what Math says, what matters is which element I want to access. LinkedHashSet (API docs linked to in the question) preserves insertion order -- I want to access the last-inserted element. The why doesn't matter. – markvgti Jun 28 '14 at 06:45
  • @alfasin This particular implementation of `Set` does in fact guarantee iteration order. From the javadoc: "Hash table and linked list implementation of the Set interface, with predictable iteration order. This implementation differs from HashSet in that it maintains a doubly-linked list running through all of its entries. This linked list defines the iteration ordering, which is the order in which elements were inserted into the set (insertion-order)." – awksp Jun 28 '14 at 06:50
  • @alfasin So I guess you're saying the implementers of LinkedHashSet who wrote the Javadoc are wrong? – markvgti Jun 28 '14 at 06:53
  • @user3580294 like you wrote: it guarantees *iteration* order, not *INSERTION* order. See HelpVampire666's answer why it's not the same thing. – Nir Alfasi Jun 28 '14 at 06:58
  • 1
    @alfasin "This linked list defines the iteration ordering, **which is the order in which elements were inserted into the set (insertion-order).**" The iteration order **is** insertion order. – awksp Jun 28 '14 at 06:58
  • 1
    @alfasin It's a *Set*. You *can't* "re-insert" an element into it because it doesn't contain duplicates. Once an element is there, it's there. "Re-inserting" an element wouldn't make sense, and because it wouldn't change a `Set`, shouldn't change anything else either. The side effects would be too difficult to deal with. – awksp Jun 28 '14 at 07:00
  • @user3580294 now why is that ? because: SET has no order! – Nir Alfasi Jun 28 '14 at 07:01
  • 3
    @alfasin And yet there's a set implementation *with* an order -- and that implementation is `LinkedHashSet`. `Set` makes no guarantees to order, and *doesn't prohibit order either*. So it's perfectly OK for an implementation of a `Set` to have an order. – awksp Jun 28 '14 at 07:02
  • 1
    @alfasin So there's no way we can resolve this if we see differently. I chose the first approach because I see the insertions as a sequence of insertions *to a set*, and if the insertion to the set failed there's no reason you should track it. To me, keeping track of the insertions separately from the set itself is illogical because by definition you're tracking insertions to the set. Can't say that the equivalent behavior for `LinkedHashMap` makes sense, but it is logical to me for `LinkedHashSet`. – awksp Jun 28 '14 at 07:08
  • @user3580294 when you try to insert the same element to a set, you don't get any exception thrown at you, guess why ? (hint: it's a valid action and there is no "failure" here). – Nir Alfasi Jun 28 '14 at 07:12
  • @alfasin No, you don't get an exception -- although if the implementers had decided to do so you certainly could get one (that would be a horrible misuse of exceptions, though). But you get a return value of `false` -- as in the addition failed. Yes, it's a valid action. No, it shouldn't change anything about the collection if the action wouldn't change anything for any other `Set` implementation. – awksp Jun 28 '14 at 07:13
  • @user3580294 the return value is a fair claim! (Not that I agree with you because of it :) I guess it's open to interpretation if the "insert-sequence" should include repeating elements. – Nir Alfasi Jun 28 '14 at 07:19
  • @alfasin Yeah, don't think we can really get further unless we can get Josh Bloch here to describe the design decisions. I guess we can at least be happy that `LinkedHashSet` and `LinkedHashMap` are consistent in their behavior. – awksp Jun 28 '14 at 07:21
  • 1
    @user3580294 it was an interesting discussion though :) – Nir Alfasi Jun 28 '14 at 07:22
  • @alfasin Your argumentation is pointless as the example test I have posted clearly shows that a) LinkedHashSet's idea of insertion ordering matches with my own and b) it does what I want it to do. Your idea of insertion order may be different (which doesn't make it any less valid) but it's pointless to argue what *should be* vs. *what is*. – markvgti Jun 28 '14 at 07:23
  • @alfasin Agreed. It'd be nice to ask a question clarifying this, although I'm not sure it'd be closed due to no one being able to provide an authoritative answer outside of the Collections authors. But I'm pretty sure I've seen some Java language architects around, maybe Bloch is among them... Or maybe someone is enlightened enough to answer. Who knows... – awksp Jun 28 '14 at 07:25
  • @markvgti you obviously didn't follow the conversation here. The example you provided doesn't handle cases of multiple insertion of the same item. From the link that you posted to LinkedHashSet docs: "Note that insertion order is not affected if an element is re-inserted into the set." That said, if you're happy with it (and aware of the limitation) - go for it. – Nir Alfasi Jun 28 '14 at 07:26
  • @alfasin Are you in the habit of writing before reading fully? http://pastebin.com/KZJ3ETx9 clearly inserts the number 3 multiple times!!! Yet the ordering of elements is utterly predictable. – markvgti Jun 28 '14 at 07:30
  • @markvgti add `System.out.println(list.get(2).intValue());` it'll print `3` when it's obviously the LAST item to be inserted (your example doesn't test *that*). – Nir Alfasi Jun 28 '14 at 07:36
  • @alfasin It isn't the last item to be inserted, it's the 3rd (remember, duplicate insertions don't count --- that's the whole point of a Set). 8 is the last item to actually be inserted into the Set. So `list.get(2).intValue()` **should return** 3, which is what it does. I understand we have a difference of opinion as to what constitutes an "insertion" (luckily for me, my understanding is the same as that of the writers of `LinkedHashSet`), but this discussion (and down-voting of the question) is fruitless (and counter-productive) as it doesn't get us any closer to an answer to my question. – markvgti Jun 28 '14 at 07:45
  • @user3580294 had to write one more thing... the [*docs*](http://docs.oracle.com/javase/7/docs/api/java/util/Set.html#add(E)) state that the return value only shows "if this set did not already contain the specified element" - it doesn't mean that the insertion failed. Just sayin :) – Nir Alfasi Jun 28 '14 at 07:52
  • @alfasin And it also says "If this set already contains the element, the call leaves the set unchanged and returns false". So I suppose it depends on what you mean by "failed"... On a side note, it also says "sets may refuse to add any particular element, including null, and throw an exception". So it seems exceptions are actually allowed... – awksp Jun 28 '14 at 08:00
  • @user3580294 yeps, 4 types of exceptions are thrown, none of which for duplicates, which just stresses the point that order is meaningless :D – Nir Alfasi Jun 28 '14 at 08:02
  • Am I the only one who is confused by the mixed use of "nth element" (which, for me at least, suggests an *indexed access* with index `n`), and the use of "last element" (which is trivial to accomplish using the approach of HelpVampire666 or omu_negru)? So if you need indexed access, Set is the wrong data structure, and you should consider a different approach (maybe storing the data in a Set **and** in a list, cleverly connected to avoid duplicates). If you need the last element, use one of the answers of the aforementioned users. – Marco13 Jun 28 '14 at 09:46
  • @Marco13 That's why I have modified the question to remove all mention of "last element". nth element is the general case and that's what I am more interested in. Obviously the `Set` is an easy way to ensure there are no duplicates. I want to avoid the overhead of ensuring that a `Set` and a `List` stay in sync. – markvgti Jun 28 '14 at 11:47
  • For the general case, there is a tradeoff. You can have O(1), O(logn) or O(n) for the most common operations (insert, access, ...). Depending on *which* operations you have to perform (and which are time-critical), you should use the appropriate data structure. Another tradeoff is the one between memory consumption and certain operations. You could easily accomplish O(1) insertion *and* indexed access when you use a Set+List, storing the data twice. In any case, in order to give an appropriate answer, the requirements must be clearly defined on this technical level. – Marco13 Jun 28 '14 at 13:50
  • 1
    @alfasin What do you mean, "order is meaningless"? If order is meaningless, then why bother make a `Set` implementation with order? (Oh shoot.... Is this going back to the beginning of our conversation again?) – awksp Jun 28 '14 at 18:31
  • 1
    @user3580294 :D I like you dude! :))) – Nir Alfasi Jun 28 '14 at 18:48

7 Answers

6

This method is based on the updated requirement to return the nth element, rather than just the last element. If the source is e.g. a Set with identifier mySet, the last element can be selected by nthElement(mySet, mySet.size()-1).

If n is small compared to the size of the Set, this method may be faster than e.g. converting to an ArrayList.

  /**
   * Return an element selected by position in iteration order.
   * @param data The source from which an element is to be selected
   * @param n The index of the required element. If it is not in the 
   * range of elements of the iterable, the method returns null.
   * @return The selected element.
   */
  public static final <T> T nthElement(Iterable<T> data, int n){
    int index = 0;
    for(T element : data){
      if(index == n){
        return element;
      }
      index++;
    }
    return null;
  }
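
To illustrate the "may be faster when n is small" point, here is a rough sketch that counts how many elements the loop pulls from the iterator before it returns. `EarlyExitDemo` and `visitedFor` are hypothetical names used only for this demonstration; `nthElement` repeats the logic of the method above.

```java
import java.util.Iterator;
import java.util.LinkedHashSet;
import java.util.Set;

class EarlyExitDemo {

    // Same logic as the method above: walk the iteration order, stop at index n.
    static <T> T nthElement(Iterable<T> data, int n) {
        int index = 0;
        for (T element : data) {
            if (index == n) {
                return element;
            }
            index++;
        }
        return null; // n is negative or beyond the last element
    }

    // Wraps the source so that every call to next() is counted, then runs nthElement.
    static <T> int visitedFor(Iterable<T> data, int n) {
        int[] visited = {0};
        Iterable<T> counting = () -> {
            Iterator<T> it = data.iterator();
            return new Iterator<T>() {
                @Override public boolean hasNext() { return it.hasNext(); }
                @Override public T next() { visited[0]++; return it.next(); }
            };
        };
        nthElement(counting, n);
        return visited[0];
    }

    public static void main(String[] args) {
        Set<Integer> mySet = new LinkedHashSet<>();
        for (int i = 0; i < 1000; i++) {
            mySet.add(i);
        }
        System.out.println(visitedFor(mySet, 2));   // 3
        System.out.println(visitedFor(mySet, 999)); // 1000
    }
}
```

Copying into an ArrayList always touches every element, while the loop stops after n + 1 of them. Note that a null result is ambiguous if the set itself may contain null.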
Patricia Shanahan
3

I'd use the iterator of the LinkedHashSet if you want to retrieve the last element:

Iterator<T> it = linkedHashSet.iterator();
T value = null;

while (it.hasNext()) {
    value = it.next();
}

After the loop finishes, value refers to the last element (or stays null if the set is empty).

Juvanis
  • 4
    You're not answering the question! he didn't ask "how to do it" he asked which way is more efficient – Nir Alfasi Jun 28 '14 at 06:43
  • 3
    @alfasin by answering with my own solution, i'm implicitly claiming this way is more efficient. – Juvanis Jun 28 '14 at 06:45
  • It's more efficient than what ? it's actually *not* more efficient than either of the options he provided both are O(n), and so is your solution. – Nir Alfasi Jun 28 '14 at 06:49
  • 2
    @alfasin it has the same asymptotic behaviour but it has a smaller constant factor than the other solutions, so clearly it is more efficient. – Henry Jun 28 '14 at 06:53
  • on the contrary, the constant factor of this solution is O(n) as well while in the other solutions it's O(1)... – Nir Alfasi Jun 28 '14 at 06:56
  • Both alternatives that the OP provides *also* iterate over all the elements of the set. The `ArrayList` constructor internally invokes `toArray()` and both `toArray()` and `toArray(Object[])` (implemented in `AbstractCollection` and not overridden) internally iterate over all the elements of the set. So this answer describes a way to do the same without the overhead of creating temporary arrays and `ArrayList` objects. So it's at least as fast (if HotSpot can eliminate the temporaries) or faster than both of the OP's suggestions. – Erwin Bolwidt Jun 28 '14 at 07:26
  • @ErwinBolwidt as I wrote on my second comment - all the solutions provided on this page are O(n). If you claim that one of them is faster you should prove your claim by benchmarking the three... – Nir Alfasi Jun 28 '14 at 07:40
  • 1
    No you don't. Please *read* my comment. Since the two methods suggested by the OP do **exactly** what @Juvanis does, plus more (allocate object and array) then you can prove that Juvanis solution does less work, which is a good definition of "more efficient". Btw you brought in the big-O notation, the OP never mentioned it. – Erwin Bolwidt Jun 28 '14 at 08:05
2

So I decided to go with a slight variation of the answer by @Juvanis.

To get at the nth element in a LinkedHashSet:

Iterator<T> itr = mySet.iterator();
int nth = y;
T value = null;

for(int i = 0; itr.hasNext(); i++) {
    value = itr.next();
    if (i == nth) {
        break;
    }
}

Version 2 of the code:

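import java.util.Set;

public class SetUtil {

    // @Nullable is not part of the JDK; it comes from an annotations library
    // such as JSR-305 (javax.annotation.Nullable) or org.jetbrains.annotations.
    @Nullable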
    public static <T> T nthElement(Set<T> set, int n) {
        if (null != set && n >= 0 && n < set.size()) {
            int count = 0;
            for (T element : set) {
                if (n == count)
                    return element;
                count++;
            }
        }
        return null;
    }
}

NB: with some slight modifications the method above can be used for any Iterable<T>.

This avoids the overhead of ensuring that a Set and a List stay in sync, and also avoids having to create a new List every time (copying the whole set on each lookup costs more than walking the iterator to position n).

Obviously I am using a Set to ensure uniqueness and I'd rather avoid a lengthy explanation as to why I need indexed access.
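
A sketch of what the Iterable<T> variant mentioned in the NB might look like. The class name `IterableUtil`, the size check for collections, and the shortcut for lists are my own assumptions, not part of the code above.

```java
import java.util.Arrays;
import java.util.Collection;
import java.util.LinkedHashSet;
import java.util.List;
import java.util.Set;

class IterableUtil {

    // Returns the element at position n in iteration order, or null if there is none.
    static <T> T nthElement(Iterable<T> source, int n) {
        if (source == null || n < 0) {
            return null;
        }
        // Collections know their size, so an out-of-range index is rejected without iterating.
        if (source instanceof Collection && n >= ((Collection<?>) source).size()) {
            return null;
        }
        // Lists already offer indexed access.
        if (source instanceof List) {
            return ((List<T>) source).get(n);
        }
        int count = 0;
        for (T element : source) {
            if (count == n) {
                return element;
            }
            count++;
        }
        return null; // a plain Iterable ran out before reaching position n
    }

    public static void main(String[] args) {
        Set<String> mySet = new LinkedHashSet<>(Arrays.asList("x", "y", "z"));
        System.out.println(nthElement(mySet, 1));                    // y
        System.out.println(nthElement(Arrays.asList(10, 20, 30), 2)); // 30
        System.out.println(nthElement(mySet, 3));                    // null
    }
}
```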

markvgti
1

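You can go with the solution below; here I have added ModelClass objects to a HashSet (hashset1).

ModelClass m1 = null;
int nth = scanner.nextInt(); // scanner is assumed to be a java.util.Scanner, e.g. over System.in
// Assumes hashset1 is declared as a Set<ModelClass>
Iterator<ModelClass> itr = hashset1.iterator();
for (int index = 0; itr.hasNext(); index++) {
    m1 = itr.next();
    if (nth == index) {
        System.out.println(m1);
        break;
    }
}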
Hardik Patel
1

A straightforward approach using streams:

mySet.stream().skip(x).findFirst()
        .orElseThrow(IndexOutOfBoundsException::new);
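
For completeness, a self-contained sketch of the same idea wrapped in a method. `StreamNth` and `nth` are hypothetical names; the explicit negative-index check is an addition, since `skip()` rejects negative arguments with an IllegalArgumentException rather than an IndexOutOfBoundsException.

```java
import java.util.Arrays;
import java.util.LinkedHashSet;
import java.util.Set;

class StreamNth {

    // Returns the element at position x in iteration order, or throws if x is out of range.
    static <T> T nth(Set<T> set, int x) {
        if (x < 0) {
            throw new IndexOutOfBoundsException("Index: " + x);
        }
        return set.stream()
                .skip(x)      // drop the first x elements in encounter order
                .findFirst()  // empty if fewer than x + 1 elements exist
                .orElseThrow(() -> new IndexOutOfBoundsException("Index: " + x));
    }

    public static void main(String[] args) {
        Set<String> mySet = new LinkedHashSet<>(Arrays.asList("a", "b", "c"));
        System.out.println(nth(mySet, 1)); // b
    }
}
```

One caveat: `findFirst()` throws a NullPointerException if the selected element is null, so this only suits sets without null members.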
M. Justin
0

Set is unordered so the information on the last element inserted is lost. You cannot as such get the last element inserted. So don't use Set in the first place, or, if you really want to keep track of the last element, create a class containing that like this

class mySetAndLast extends Set{
   T last;
   Set<T> mySet;       
}

Now the question is what the 'last element inserted' means. Imagine your set was empty:

-> insert x -> ok, x is the last inserted 
-> insert y (y!=x) -> ok: y is the last inserted 
-> insert x -> ? 

Is x or y now the last inserted? x does not get inserted, because it already is an element of the set, so y remains the last element inserted; on the other hand, from the user's point of view x was the last inserted.

  • 1
    From the javadoc for [LinkedHashSet](http://docs.oracle.com/javase/7/docs/api/java/util/LinkedHashSet.html): "defines the iteration ordering, which is the order in which elements were inserted into the set (insertion-order). Note that insertion order is not affected if an element is re-inserted into the set." There's a reason I am specifically using a `LinkedHashSet`. – markvgti Jun 28 '14 at 06:48
  • Where did the `T` come from ? And naming conventions ? Where is the `compareTo`, `equals`, `hashCode` methods ? Is extending a `Set` a right thing to do ? – bsd Jun 28 '14 at 06:49
  • 1
    This technically won't even compile because `Set` is an interface. The idea is right though. – awksp Jun 28 '14 at 06:52
  • And considering this is a `Set`, you only "insert" `x` *once*. If you try to add a duplicate element it will fail to add, period. That is the definition of a `Set`. Doesn't matter what "the user's point of view" is. Attempting to put something into a regular `HashSet` multiple times will only end up inserting it once too. There's a reason there's a return value on the `add()` method. – awksp Jun 28 '14 at 06:53
  • @HelpVampire666 This idea doesn't work if I want the 5th element or the 11th. The two methods I asked about work in all cases. I am simply asking which one is better (and why). – markvgti Jun 28 '14 at 06:57
0

For your own, internal purpose, you could "hack" your own Set from any List implementation:

public class ListSet<E> extends ArrayList<E> implements Set<E> {
    @Override
    public boolean add(E item) {
        return contains(item) ? false : super.add(item);
    }

    // ... and same for add(int, E), addAll(...), etc.
}

That example is slow (O(n) for an add) but, as you are the one implementing it, you can go back to it with smarter code for contains() based on your specifications.
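
One possible shape for the "smarter contains()" idea, as a sketch only: keep a HashSet next to the ArrayList so that membership checks no longer scan the list. `IndexedSet` is a hypothetical name, it uses composition instead of extending ArrayList, and removal is left out because it would need to update both structures.

```java
import java.util.ArrayList;
import java.util.HashSet;
import java.util.List;
import java.util.Set;

class IndexedSet<E> {

    private final Set<E> members = new HashSet<>();  // fast duplicate check
    private final List<E> order = new ArrayList<>(); // insertion order with indexed access

    // Adds the item only if it is not already present; returns true if it was added.
    boolean add(E item) {
        if (!members.add(item)) {
            return false; // duplicate: neither structure changes
        }
        order.add(item);
        return true;
    }

    boolean contains(Object item) {
        return members.contains(item);
    }

    // Returns the element that was the (index + 1)-th to be inserted.
    E get(int index) {
        return order.get(index);
    }

    int size() {
        return order.size();
    }

    public static void main(String[] args) {
        IndexedSet<String> set = new IndexedSet<>();
        set.add("a");
        set.add("b");
        set.add("a"); // ignored, already present
        System.out.println(set.get(1)); // b
        System.out.println(set.size()); // 2
    }
}
```

The price is storing every element twice, which is the memory-versus-speed tradeoff raised in the comments on the question.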

Matthieu