I have been loading my entire index into memory using RAMDirectory to improve performance, and it worked beautifully until my index grew and grew. Now I am getting an OutOfMemoryException. While my index on disk is 1.24GB, I suspect that the RAMDirectory object ends up exceeding the .NET 2GB object size limit, and that is why the exception is thrown. Another reason might be that the virtual address space is simply too fragmented to find a hole big enough for my object.

I would love to continue using RAMDirectory. How can I do that while avoiding the OutOfMemoryException?

Please also note that when I write my index I call IndexWriter.Optimize so the entire index is in one big file.

Barka
  • I think you will see minimal speed improvement from using RAMDirectory for searching. I would try it without it, and I bet you will see that the performance is fairly equivalent once you warm the index. – John Sobolewski Dec 05 '11 at 20:58
  • thanks! what does it take to warm the index? – Barka Dec 05 '11 at 21:01
  • Switch to a 64-bit operating system. – Hans Passant Dec 05 '11 at 21:05
  • thanks @Hans, I am on a 64 bit OS – Barka Dec 05 '11 at 21:11
  • That can't be right. Did you set the Platform target property of your main EXE project to AnyCPU? – Hans Passant Dec 05 '11 at 21:12
  • @HansPassant I just checked and it is AnyCPU. My understanding is that the 2GB object size limit exists in the 64-bit world also. Am I wrong? – Barka Dec 05 '11 at 21:25
  • @HansPassant just for kicks, I changed the Platform target property to x64 and got a System.BadImageFormatException. – Barka Dec 05 '11 at 21:32
  • I think this is a Lucene.NET limitation, see the thread here http://grokbase.com/p/lucene.apache.org/lucene-net-user/2007/01/re-2gb-filesize-ramdirectory/22izms4xtkmmbfblwmjcrrxxqlai. But it's an old thread so I don't know if it's still valid (although your issue suggests it is) – Matt Warren Dec 06 '11 at 10:51

1 Answer

The only way I can think of to keep using a RAMDirectory is to split the index into several smaller indexes and search them with a MultiSearcher.

This way you'll be able to avoid the .NET 2GB object size limit. Note that even on 64-bit, a single object still has a size limit of 2GB. RAMDirectory holds an array of bytes internally to represent the index, and that's probably what's making it blow up when the index gets too big.
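Conceptually, searching several smaller indexes means querying each one and merging the hits by score. Below is a minimal Java sketch of that merge step only, assuming each sub-index hands back its own scored hits. `Hit` and `ShardMerge` are hypothetical names, not Lucene or Lucene.NET types, and the real MultiSearcher may differ in detail.

```java
import java.util.ArrayList;
import java.util.Comparator;
import java.util.List;

/** Hypothetical stand-in for one scored hit; not a Lucene type. */
final class Hit {
    final int shard;   // which sub-index the hit came from
    final int doc;     // document id inside that sub-index
    final float score; // relevance score reported by that sub-index

    Hit(int shard, int doc, float score) {
        this.shard = shard;
        this.doc = doc;
        this.score = score;
    }
}

final class ShardMerge {
    /** Combines the hits of all sub-indexes and returns at most k, best score first. */
    static List<Hit> topK(List<List<Hit>> perShard, int k) {
        List<Hit> all = new ArrayList<>();
        for (List<Hit> hits : perShard) {
            all.addAll(hits);
        }
        // Sort by score, highest first.
        all.sort(Comparator.comparingDouble((Hit h) -> h.score).reversed());
        return new ArrayList<>(all.subList(0, Math.min(k, all.size())));
    }
}
```

The extra merge pass is the per-query overhead of splitting the index: each sub-index stays small, but every query has to visit all of them.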

In my humble opinion, though, you should probably think about using an FSDirectory with large indexes. The speed is usually good enough for most applications after it's warmed up.

Jf Beaulac
  • Thanks! It seems like the performance gained by using an in-memory object will be offset by the performance loss by using MultiSearcher. – Barka Dec 07 '11 at 23:15
  • Just run a couple of typical queries when you re-open the searcher so it can build up its caches. The way I do this: when I re-open the searcher, I keep the old one open to keep serving queries and start warming up the new one. When it's warmed up, I replace the old one with the new one (basically changing the value of a variable) and close the old searcher after the swap is done. – Jf Beaulac Dec 08 '11 at 15:26
  • Many thanks! It seems that the correct answer is to use FSDirectory, since there is no good way to use RAMDirectory once the index size reaches 2GB. – Barka Dec 12 '11 at 05:24
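
The warm-then-swap approach described in the comments above could be sketched roughly as follows. This is a minimal Java illustration, not Lucene or Lucene.NET code: `WarmSwap` is a hypothetical holder, the type parameter `S` stands in for whatever searcher type is used, and the warm-up queries are passed in as plain callbacks.

```java
import java.util.List;
import java.util.concurrent.atomic.AtomicReference;
import java.util.function.Consumer;

/** Hypothetical holder for the searcher that is currently serving queries. */
final class WarmSwap<S> {
    private final AtomicReference<S> current;

    WarmSwap(S initial) {
        current = new AtomicReference<>(initial);
    }

    /** Returns the searcher that should serve incoming queries right now. */
    S get() {
        return current.get();
    }

    /**
     * Runs every warm-up callback against the fresh searcher, then publishes it.
     * The old searcher stays visible through get() until the swap happens.
     * Returns the previous searcher so the caller can close it.
     */
    S warmAndSwap(S fresh, List<Consumer<S>> warmupQueries) {
        for (Consumer<S> query : warmupQueries) {
            query.accept(fresh);
        }
        return current.getAndSet(fresh);
    }
}
```

The swap itself is a single reference update, which matches the "changing the value of a variable" description. Closing the returned searcher safely while queries may still be running on it is left to the caller and is not handled by this sketch.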