A few days ago I struggled with a strange bug in my MapReduce task.
It turned out that Hadoop's ValueIterable class, which implements the Iterable interface, creates a single iterator instance and returns it on every call to iterator().
    protected class ValueIterable implements Iterable<VALUEIN> {
        private ValueIterator iterator = new ValueIterator();

        @Override
        public Iterator<VALUEIN> iterator() {
            return iterator;
        }
    }
That means that once you have iterated over a ValueIterable, you cannot iterate over it again: the second pass gets the same, already exhausted iterator.
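To show the effect outside of Hadoop, here is a minimal sketch. SharedIteratorIterable is a hypothetical stand-in I wrote for illustration, not the real Hadoop class; it only assumes the same pattern of returning one stored iterator from every iterator() call.

```java
import java.util.Arrays;
import java.util.Iterator;
import java.util.List;

public class SingleIteratorDemo {

    // Hypothetical stand-in for the pattern used by ValueIterable:
    // iterator() always hands back the same Iterator instance.
    static class SharedIteratorIterable<T> implements Iterable<T> {
        private final Iterator<T> iterator;

        SharedIteratorIterable(List<T> values) {
            this.iterator = values.iterator();
        }

        @Override
        public Iterator<T> iterator() {
            return iterator;
        }
    }

    // Counts how many elements a single for-each pass sees.
    static <T> int countPass(Iterable<T> values) {
        int n = 0;
        for (T ignored : values) {
            n++;
        }
        return n;
    }

    public static void main(String[] args) {
        Iterable<String> values =
                new SharedIteratorIterable<>(Arrays.asList("a", "b", "c"));
        System.out.println("first pass:  " + countPass(values)); // 3
        System.out.println("second pass: " + countPass(values)); // 0
    }
}
```

The second for-each loop does not fail; it silently sees no elements, which is what made the bug hard to spot.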
I checked the Java documentation, and it does not seem to require Iterable to return a different iterator on every call (or is the requirement just missing?). Digging deeper, I found this answer, which says that having a single iterator violates the Iterator contract, since it cannot traverse the collection more than once.
Who is correct here? Should Iterable return a new iterator each time? Why are the Java docs unclear about this?
What would be the correct way for this Hadoop class to tell the client that a second traversal is impossible? I mean, if it threw IllegalStateException, would that violate the Iterator#hasNext() method contract?
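To make the last question concrete, here is a sketch of one possible fail-fast variant. It is only an assumption of how this could look, not what Hadoop does: OneShotIterable is a hypothetical wrapper, and it throws from iterator() on the second call rather than from hasNext(), so Iterator#hasNext() itself is left untouched.

```java
import java.util.Arrays;
import java.util.Iterator;

public class OneShotDemo {

    // Hypothetical wrapper (not part of Hadoop): hands out its iterator
    // once and refuses every later request instead of returning an
    // exhausted iterator.
    static class OneShotIterable<T> implements Iterable<T> {
        private final Iterator<T> source;
        private boolean handedOut = false;

        OneShotIterable(Iterator<T> source) {
            this.source = source;
        }

        @Override
        public Iterator<T> iterator() {
            if (handedOut) {
                throw new IllegalStateException(
                        "values can only be traversed once");
            }
            handedOut = true;
            return source;
        }
    }

    // Traverses once, then reports whether asking for a second
    // iterator is rejected with IllegalStateException.
    static boolean secondTraversalFails(Iterable<?> values) {
        for (Object ignored : values) {
            // first pass consumes the values
        }
        try {
            values.iterator();
            return false;
        } catch (IllegalStateException e) {
            return true;
        }
    }

    public static void main(String[] args) {
        Iterable<String> values =
                new OneShotIterable<>(Arrays.asList("a", "b").iterator());
        System.out.println(secondTraversalFails(values)); // true
    }
}
```

Whether throwing here is compatible with what the Iterable documentation promises is exactly the part I am unsure about.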