
I am using the Solr Suggester component in Solr 5.5 with a large amount of address data. I have allotted 20 GB of RAM to Solr, and the machine has 32 GB of RAM in total.

I have an address book core with the following vitals -

"numDocs"=153242074
"segmentCount"=34
"size"=30.29 GB

My solrconfig.xml looks something like this -

<searchComponent name="suggest" class="solr.SuggestComponent">
<lst name="suggester">
  <str name="name">mySuggester1</str>
  <str name="lookupImpl">FuzzyLookupFactory</str>
  <str name="storeDir">suggester_fuzzy_dir</str>

  <!-- Substitute these for the two above for another "flavor"
    <str name="lookupImpl">AnalyzingInfixLookupFactory</str>
    <str name="indexPath">suggester_infix_dir</str>
  -->

  <str name="dictionaryImpl">DocumentDictionaryFactory</str>
  <str name="field">site_address</str>
  <str name="suggestAnalyzerFieldType">suggestType</str>
  <str name="payloadField">property_metadata</str>
  <str name="buildOnStartup">false</str>
  <str name="buildOnCommit">false</str>
</lst>
<lst name="suggester">
  <str name="name">mySuggester2</str>
  <str name="lookupImpl">AnalyzingInfixLookupFactory</str>
  <str name="indexPath">suggester_infix_dir</str>

  <str name="dictionaryImpl">DocumentDictionaryFactory</str>
  <str name="field">site_address_other</str>
  <str name="suggestAnalyzerFieldType">suggestType</str>
  <str name="payloadField">property_metadata</str>
  <str name="buildOnStartup">false</str>
  <str name="buildOnCommit">false</str>
</lst>
</searchComponent>

The handler is defined like so -

<requestHandler name="/suggest" class="solr.SearchHandler" startup="lazy" >
<lst name="defaults">
  <str name="suggest">true</str>
  <str name="suggest.count">10</str>
  <str name="suggest.dictionary">mySuggester1</str>
  <str name="suggest.dictionary">mySuggester2</str>
  <str name="suggest.collate">false</str>
  <str name="echoParams">explicit</str>
</lst>
<arr name="components">
  <str>suggest</str>
</arr>
</requestHandler>
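For reference, the handler above is driven entirely by URL parameters. As a sketch (the core name `addressbook` and the host are assumptions, substitute your own), the build request that triggers the problem and a normal lookup request can be assembled like this:

```python
from urllib.parse import urlencode

# Hypothetical core name and host; adjust for your deployment.
base = "http://localhost:8983/solr/addressbook/suggest"

# One-off build request -- this is the call that runs out of memory here.
build_url = base + "?" + urlencode({"suggest.build": "true"})

# Normal lookup against both dictionaries listed in the handler defaults.
query_url = base + "?" + urlencode({"suggest": "true", "suggest.q": "100 Main St"})

print(build_url)
print(query_url)
```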

Problem Statement

Every time I try to build the suggest index using the suggest.build=true URL parameter, I end up with an OutOfMemoryError. I have no idea how to make this work with the current setup. Can anyone explain why this is happening, and how I can fix it?

Qedrix
  • People have [reported issues when running on ZFS](http://lucene.472066.n3.nabble.com/Suggester-uses-lots-of-Page-cache-memory-td4332882.html) - are you using ZFS? There are also multiple efficiency updates in the later versions of Solr (including support for newer JVMs) - did you try to upgrade? – MatsLindh Jun 11 '18 at 21:22
  • I am not using ZFS and right now an upgrade is not possible. Is there no way to solve this in the current release? – Qedrix Jun 12 '18 at 06:34

1 Answer


On the mailing list thread, the critical piece of information is that the OOME is for Java heap space.

This means that you either need to increase your heap size or reduce the amount of heap that Solr needs. It is not always possible to reduce the heap requirements, but here are some ideas:

https://wiki.apache.org/solr/SolrPerformanceProblems#Reducing_heap_requirements

elyograg
  • The issue is that Solr only uses 56% of the allotted memory during the build before it fails. I have 150M records; is there an estimate of how much RAM would be optimal for that? When I try with 25M records, the build works. – Qedrix Jun 13 '18 at 14:57
  • That OOME indicates that all the heap memory is full, and that garbage collection wasn't able to free enough of it for a new allocation to succeed. What exactly are you looking at to see 56 percent? We have no generic recommendations on things like heap size. https://lucidworks.com/2012/07/23/sizing-hardware-in-the-abstract-why-we-dont-have-a-definitive-answer/ One piece of information that I can give you: With 150 million documents, each entry in the filterCache is going to be about 18 megabytes in size. 512 of those (example cache size is 512) would need 9 gigabytes of heap. – elyograg Jun 13 '18 at 20:18
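As a sanity check on the numbers in the comment above (assuming, as the comment does, that each filterCache entry is a bitset holding one bit per document):

```python
num_docs = 150_000_000   # roughly the corpus size from the question
cache_entries = 512      # the example filterCache size cited in the comment

bytes_per_entry = num_docs // 8                      # one bit per document
mb_per_entry = bytes_per_entry / 2**20               # ~18 MB, as stated
total_gb = cache_entries * bytes_per_entry / 2**30   # ~9 GB, as stated

print(f"{mb_per_entry:.1f} MB per entry")
print(f"{total_gb:.1f} GB for a full cache")
```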