My basic question is: what is the proper way to create and use instances of NIOFSDirectory and SimpleFSDirectory when multiple threads need to run queries (reads) against the same index? More to the point: should each thread that needs to run a query create its own XXXFSDirectory instance, retrieve its results, and close the instance immediately afterwards in the same thread? Or should I create a "global" (singleton?) instance that is passed to all threads, which then use it concurrently, so that closing it is no longer up to each thread once its query is done?

Here are more details:

I've read the docs on both NIOFSDirectory and SimpleFSDirectory, and this is what I gathered:

  • they both support multithreading:

NIOFSDirectory : "An FSDirectory implementation that uses java.nio's FileChannel's positional read, which allows multiple threads to read from the same file without synchronizing."

SimpleFSDirectory : "A straightforward implementation of FSDirectory using java.io.RandomAccessFile. However, this class has poor concurrent performance (multiple threads will bottleneck) as it synchronizes when multiple threads read from the same file. It's usually better to use NIOFSDirectory or MMapDirectory instead."

  • NIOFSDirectory is better suited (basically, faster) than SimpleFSDirectory in a multithreaded context (see above)

  • NIOFSDirectory does not work well on Windows, where SimpleFSDirectory is recommended instead. On *nix systems, however, NIOFSDirectory works fine and, because of its better multithreaded performance, is recommended over SimpleFSDirectory.

"NOTE: NIOFSDirectory is not recommended on Windows because of a bug in how FileChannel.read is implemented in Sun's JRE. Inside of the implementation the position is apparently synchronized."
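
For reference, the "positional read" mentioned in the NIOFSDirectory docs is `FileChannel.read(ByteBuffer, long)`, which reads at an explicit offset and leaves the channel's own position untouched. Below is a minimal sketch that uses only the JDK (no Lucene), in which several threads read different slices of one file through a single shared channel. The class and method names are made up for illustration, and this is not how Lucene itself is implemented.

```java
import java.io.IOException;
import java.nio.ByteBuffer;
import java.nio.channels.FileChannel;
import java.nio.charset.StandardCharsets;
import java.nio.file.Files;
import java.nio.file.Path;
import java.nio.file.StandardOpenOption;
import java.util.ArrayList;
import java.util.List;
import java.util.concurrent.ExecutorService;
import java.util.concurrent.Executors;
import java.util.concurrent.Future;

// Hypothetical demo class, not part of Lucene.
public class PositionalReadDemo {

    // Reads up to `length` bytes starting at `offset` with the positional read,
    // which does not modify the channel's own position.
    static String readAt(FileChannel channel, long offset, int length) throws IOException {
        ByteBuffer buffer = ByteBuffer.allocate(length);
        long position = offset;
        while (buffer.hasRemaining()) {
            int n = channel.read(buffer, position);
            if (n < 0) {
                break; // reached end of file
            }
            position += n;
        }
        return new String(buffer.array(), 0, buffer.position(), StandardCharsets.US_ASCII);
    }

    // Several worker threads read consecutive slices of the same file
    // through ONE shared FileChannel, without any explicit locking.
    static List<String> readSlicesConcurrently(Path file, int sliceLength, int slices) throws Exception {
        ExecutorService pool = Executors.newFixedThreadPool(4);
        try (FileChannel channel = FileChannel.open(file, StandardOpenOption.READ)) {
            List<Future<String>> futures = new ArrayList<>();
            for (int i = 0; i < slices; i++) {
                final long offset = (long) i * sliceLength;
                futures.add(pool.submit(() -> readAt(channel, offset, sliceLength)));
            }
            List<String> result = new ArrayList<>();
            for (Future<String> future : futures) {
                result.add(future.get()); // wait for each slice, in order
            }
            return result;
        } finally {
            pool.shutdown();
        }
    }

    public static void main(String[] args) throws Exception {
        Path file = Files.createTempFile("positional-read", ".txt");
        try {
            Files.write(file, "aaaabbbbccccdddd".getBytes(StandardCharsets.US_ASCII));
            System.out.println(readSlicesConcurrently(file, 4, 4));
        } finally {
            Files.deleteIfExists(file);
        }
    }
}
```

Each task passes its own offset, so the threads never compete over a shared file position. A `RandomAccessFile`, by contrast, has to seek and then read, and those two steps must be synchronized when threads share one file.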

The reason I'm asking is that I've seen actual projects, targeting Linux, in which NIOFSDirectory is used to read from the index but a new instance is created for each request (in each thread). Once the query is done and the results are returned, the thread closes that instance, only to create a new one for the next request, and so on. So I was wondering whether this is really a better approach than having a single NIOFSDirectory instance shared by all threads, opened when the application starts and closed much later, when a certain (multithreaded) job is finished...

More to the point, for a web application, isn't it better to have something like a context listener that creates an instance of NIOFSDirectory and places it into the application context, so that all servlets share and use it, and then have the same context listener close it when the app shuts down?

Shivan Dragon

1 Answer

The official Lucene FAQ suggests the following:

Share a single IndexSearcher across queries and across threads in your application.

An IndexSearcher requires a single IndexReader, which can be obtained with DirectoryReader.open(Directory), and that in turn requires only a single Directory instance.
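
A minimal sketch of that chain, assuming Lucene 5 or later (where directories are constructed from a `java.nio.file.Path`; Lucene 4.x takes a `java.io.File` instead). The `SharedIndex` holder class is a hypothetical name of my own, not something Lucene provides. The idea is to open it once at startup, hand the same instance to every thread, and close it once at shutdown.

```java
import java.io.IOException;
import java.nio.file.Path;

import org.apache.lucene.index.DirectoryReader;
import org.apache.lucene.search.IndexSearcher;
import org.apache.lucene.search.Query;
import org.apache.lucene.search.TopDocs;
import org.apache.lucene.store.Directory;
import org.apache.lucene.store.NIOFSDirectory;

// Hypothetical holder: one Directory, one reader, one searcher for the whole app.
public final class SharedIndex implements AutoCloseable {
    private final Directory directory;
    private final DirectoryReader reader;
    private final IndexSearcher searcher;

    public SharedIndex(Path indexPath) throws IOException {
        Directory dir = new NIOFSDirectory(indexPath);
        DirectoryReader rdr;
        try {
            rdr = DirectoryReader.open(dir);
        } catch (IOException e) {
            dir.close(); // do not leak the directory if the index cannot be opened
            throw e;
        }
        this.directory = dir;
        this.reader = rdr;
        this.searcher = new IndexSearcher(rdr);
    }

    // IndexSearcher is documented as thread-safe, so many threads may call this concurrently.
    public TopDocs search(Query query, int maxHits) throws IOException {
        return searcher.search(query, maxHits);
    }

    // Call once, when the application (or the multithreaded job) shuts down.
    @Override
    public void close() throws IOException {
        try {
            reader.close();
        } finally {
            directory.close();
        }
    }
}
```

In a web application, a context listener could create one `SharedIndex` on startup and close it on shutdown, which matches the approach proposed in the question.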

mindas
  • Thanks, I had missed that piece of information! Any idea whether there might be bad side effects if the IndexReader is left open for a long time (say, days)? Should there be some "management thread" or similar that closes and reopens it periodically? – Shivan Dragon Feb 18 '13 at 15:42
  • If the reader is kept open and constantly used, then it's fine (we use it like that, no problems whatsoever). We only reload the reader(s) if any changes are made. But if you open readers and don't close them, this can lead to a *too many open files* exception. – mindas Feb 18 '13 at 15:47
  • Ah, OK, so if modifications are made AFTER a reader is opened, that reader will not see the changes? You need to close it and create a new reader for the changes to be taken into account? – Shivan Dragon Feb 19 '13 at 09:13
  • Yes, you are correct. You might call `.reopen()` though. Also read this: http://blog.mikemccandless.com/2011/09/lucenes-searchermanager-simplifies.html – mindas Feb 19 '13 at 09:36
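
As far as I know, `reopen()` is the Lucene 3.x name; from 4.0 on the equivalent is `DirectoryReader.openIfChanged(reader)`, which returns `null` when nothing has changed. The `SearcherManager` described in the linked post wraps that logic. Here is a minimal sketch of how it might be used, again assuming a reasonably recent Lucene version. `RefreshingSearch` is a hypothetical wrapper name, and when to call `refresh()` (after each commit, on a timer, and so on) is left to the application.

```java
import java.io.IOException;

import org.apache.lucene.search.IndexSearcher;
import org.apache.lucene.search.Query;
import org.apache.lucene.search.SearcherManager;
import org.apache.lucene.search.TopDocs;
import org.apache.lucene.store.Directory;

// Hypothetical wrapper around Lucene's SearcherManager.
public final class RefreshingSearch implements AutoCloseable {
    private final SearcherManager manager;

    public RefreshingSearch(Directory directory) throws IOException {
        // A null SearcherFactory means "create plain IndexSearcher instances".
        this.manager = new SearcherManager(directory, null);
    }

    // Safe to call from many threads; every acquire() must be paired with a release().
    public TopDocs search(Query query, int maxHits) throws IOException {
        IndexSearcher searcher = manager.acquire();
        try {
            return searcher.search(query, maxHits);
        } finally {
            manager.release(searcher);
        }
    }

    // Picks up changes committed since the current searcher was opened, if any.
    public void refresh() throws IOException {
        manager.maybeRefresh();
    }

    // Closes the manager only; the caller still owns and closes the Directory.
    @Override
    public void close() throws IOException {
        manager.close();
    }
}
```

Searches that are in flight keep using the old searcher until they release it, so no thread has to close and recreate a reader by hand.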