I would like an iterator that multiple threads can read concurrently, so that I can process the data from the iterator's source in parallel. The challenge is that I can't couple a hasNext() call with its logical next(), because calls from other threads could be interleaved between the two. (That is, two threads can call hasNext(), each see true, and then the second thread's next() fails because there was only one item left.) A further problem is that for some sources I don't know whether there is a next element until I try to read it. One example is reading lines from a file; another is reading Term instances from a Lucene index.
I was thinking of setting up a queue inside the iterator and feeding it from a separate thread. That way, hasNext() could be implemented in terms of the queue size. But I don't see how I could guarantee that the queue is filled, because the feeding thread could get starved.
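To make the queue idea concrete, here is a rough sketch of one variant of it. It assumes a java.util.concurrent.BlockingQueue, non-null elements, and an end-of-data marker object; the names QueueFedSource, END, and take() are made up for illustration. Instead of answering hasNext() from the queue size, consumers block until either an element or the end marker arrives, so an empty queue is never mistaken for the end of the data. I am not sure this is the right approach, and note that it no longer implements Iterator.

```java
import java.util.Iterator;
import java.util.Optional;
import java.util.concurrent.BlockingQueue;
import java.util.concurrent.LinkedBlockingQueue;

// Hypothetical helper: a feeder thread copies the source into a bounded queue.
class QueueFedSource<T> {
    // Marker object signalling that the source is exhausted (an assumption of this sketch).
    private static final Object END = new Object();
    private final BlockingQueue<Object> queue = new LinkedBlockingQueue<>(1024);

    QueueFedSource(Iterator<T> source) {
        Thread feeder = new Thread(() -> {
            try {
                while (source.hasNext()) {
                    queue.put(source.next()); // blocks while the queue is full
                }
                queue.put(END);
            } catch (InterruptedException e) {
                // If the feeder is interrupted, END is never queued; a real
                // implementation would need to handle that case.
                Thread.currentThread().interrupt();
            }
        });
        feeder.setDaemon(true);
        feeder.start();
    }

    // Blocks until an element or the end marker is available.
    // Returns Optional.empty() once the source is exhausted (or if interrupted).
    @SuppressWarnings("unchecked")
    Optional<T> take() {
        try {
            Object item = queue.take();
            if (item == END) {
                queue.offer(END); // put the marker back so other consumers see it too
                return Optional.empty();
            }
            return Optional.of((T) item);
        } catch (InterruptedException e) {
            Thread.currentThread().interrupt();
            return Optional.empty();
        }
    }
}
```

Each worker thread would loop on take() until it gets an empty Optional. A starved feeder only delays the consumers in this version; it does not make them stop early.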
Should I ignore the Iterator contract and just call next() repeatedly until a NoSuchElementException is thrown?
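For reference, this is roughly what I mean by that option: a minimal sketch, assuming a wrapper that makes the check and the fetch a single synchronized step. SharedIterator and drain are hypothetical names of my own, not part of any library.

```java
import java.util.Iterator;
import java.util.NoSuchElementException;
import java.util.function.Consumer;

// Hypothetical wrapper: exposes only next(), so no thread can act on a stale hasNext().
class SharedIterator<T> {
    private final Iterator<T> source;

    SharedIterator(Iterator<T> source) {
        this.source = source;
    }

    // hasNext() and next() on the underlying iterator run under one lock.
    synchronized T next() {
        if (!source.hasNext()) {
            throw new NoSuchElementException();
        }
        return source.next();
    }

    // Worker loop: fetch until the wrapper reports exhaustion.
    // Returns how many items this caller processed.
    static <T> int drain(SharedIterator<T> it, Consumer<T> work) {
        int processed = 0;
        while (true) {
            T item;
            try {
                item = it.next();
            } catch (NoSuchElementException end) {
                return processed;
            }
            // Processing happens outside the try block, so a NoSuchElementException
            // thrown by the work itself is not mistaken for the end of the data.
            work.accept(item);
            processed++;
        }
    }
}
```

Each worker thread would call drain on the same SharedIterator instance. It works, but it uses an exception for normal control flow, which is the part that feels inelegant to me.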
Is there a more elegant way of handling the problem?