
I have a memory leak in my Solr 3.6.2-based app, which holds its index on NFS. I've read that keeping a Solr index on NFS is not best practice, but I inherited this system and can't change it at the moment.

Every X (configurable) minutes, the system "wakes up" and indexes data it collected while sleeping. It reads the data from files and Solr indexes it. The problem is that the Solr app's memory grows with every indexing cycle, until it eventually crashes with an OOM error after several cycles.

[GC log graph]

I took heap dumps several times and analyzed them with Eclipse Memory Analyzer. The "Remainder" section is very large,

and it seems to consist of many unreachable org.apache.lucene.index.FreqProxTermsWriterPerField.FreqProxPostingsArray objects: [dominator tree screenshot]

Is this the source of the leak or is this normal?

Another strange behavior is that more and more hidden .nfs00000000000..... files are generated in the index directory (which, as noted, is on NFS) with every indexing cycle.

These files take a lot of disk space, and they disappear when I kill the Solr app. I suspect this is related to the leak. Does the app somehow hold those files open without releasing them for as long as it's alive? Is that the reason for the inevitable OOM error? How can I avoid accumulating more and more of those files?
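For context, the .nfsXXXX pattern is the NFS client's "silly rename": POSIX allows a process to keep reading a file after it has been deleted as long as a handle stays open, and NFS emulates that by renaming the deleted-but-still-open file instead of removing it. A minimal shell sketch of the underlying delete-while-open behavior (on a local filesystem the name simply disappears; on NFS a .nfsXXXX file would appear until the descriptor is closed):

```shell
# Demonstrate delete-while-open: an open descriptor outlives the file name.
tmp=$(mktemp)
echo "index data" > "$tmp"
exec 3< "$tmp"   # hold an open read descriptor, like Solr holds segment files
rm "$tmp"        # the name is gone; on NFS this would leave a .nfsXXXX file
cat <&3          # the data is still readable through the open descriptor
exec 3<&-        # only now can the filesystem reclaim the space
```

If Solr is indeed the process keeping those handles alive, something like `lsof +D /path/to/index` (with your actual index path) should list the deleted segment files as still open under the Solr JVM's PID.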

Thanks!

  • The "Remainder" section: https://i.stack.imgur.com/BQFjQ.jpg – Arie Shterengartz Aug 14 '17 at 15:43
  • Please let me know if any more details are needed. – Arie Shterengartz Aug 14 '17 at 15:44
  • The answer for the `.nfs` files [can be found on ServerFault](https://serverfault.com/questions/201294/nfsxxxx-files-appearing-what-are-those). Other than that, 3.6.2 is way too old to say anything useful. The first thing to do with memory leaks is always to upgrade to a more recent version. My guess is that those writers get into a lock with NFS and therefore never get freed, since they still keep the files open; and since the files are still open, NFS can't really do anything about them and keeps them as `.nfs` files. There might be a reason why NFS isn't suggested for Lucene and Solr :-) – MatsLindh Aug 14 '17 at 18:29
  • Is there a way to see in Eclipse Memory Analyzer whether the leaked memory is indeed caused by those files? – Arie Shterengartz Aug 15 '17 at 13:37
  • I'm not familiar with Eclipse Memory Analyzer, but if you dump the current stack, you should be able to peek at the internal values of the objects. My guess, though, is that NFS itself maintains that mapping under a different file name, so it wouldn't really be visible. A simple test would be to stop the server, move the index to a local disk, and test it there to see if you still get the same memory leak. – MatsLindh Aug 15 '17 at 14:53
  • Thanks MatsLindh, I tried to put the index on a regular, non-NFS disk, but the same problem occurs: the ".nfsXXX..." files are still being created in the new local index directory, and the leak is not solved. I'm really clueless. – Arie Shterengartz Aug 23 '17 at 07:56
  • That sounds weird, since if NFS isn't involved, those files shouldn't be created as far as I can tell. Is anything else read from NFS (i.e. configuration, etc.)? – MatsLindh Aug 23 '17 at 08:11
  • Sorry, now I see that it's also an NFS mount. I used this command to check (https://stackoverflow.com/questions/460047/how-do-i-determine-if-a-directory-is-an-nfs-mount-point-in-shellscript): `stat -f -L -c %T /data`. Unfortunately, I don't have a real local disk available for this app :( I'll have to find the root cause of those files being created (and of the leak). Would it help if I attach the code of my Solr indexing process here? – Arie Shterengartz Aug 23 '17 at 08:18
  • I'm fairly sure that won't help, but it shouldn't hurt - as long as you're committing, it should work. Do the files get released if you issue an optimize of the index, where all the segments are rewritten and old segments are closed? – MatsLindh Aug 23 '17 at 08:24
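To try the last suggestion above, an optimize can be issued over HTTP against the update handler in Solr 3.x, which rewrites all segments into one and closes the old segment files (so the NFS client should then drop its .nfsXXXX copies). The host, port, and core path below are assumptions about the deployment; the sketch only prints the command to run rather than contacting a server:

```shell
# Assumed deployment values; adjust SOLR_URL for your setup.
SOLR_URL="http://localhost:8983/solr"
# In Solr 3.x, optimize=true on the update handler merges all segments into
# one and closes the old segment files.
cmd="curl '${SOLR_URL}/update?optimize=true'"
echo "$cmd"   # run this against your live instance
```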

0 Answers