Slow Solr startup decompressing stored fields

Question

I have an embedded Solr server that I use in combination with Spring Data Solr. The index holds about 600k documents and takes up 3 GB. During startup, Solr takes several minutes before the first query can execute. Using VisualVM, I tracked the bottleneck down to what seems to be loading the first document, where LZ4 decompression spends a long time reading from disk. The trace looks like this:

searcherExecutor-5-thread-1
    java.lang.Thread.run()
     java.util.concurrent.ThreadPoolExecutor$Worker.run()
      java.util.concurrent.ThreadPoolExecutor.runWorker()
       java.util.concurrent.FutureTask.run()
        java.util.concurrent.FutureTask$Sync.innerRun()
         org.apache.solr.core.SolrCore$5.call()
          org.apache.solr.handler.component.SuggestComponent$SuggesterListener.newSearcher()
           org.apache.solr.spelling.suggest.SolrSuggester.reload()
            org.apache.solr.spelling.suggest.SolrSuggester.build()
             org.apache.lucene.search.suggest.Lookup.build()
              org.apache.lucene.search.suggest.analyzing.AnalyzingSuggester.build()
               org.apache.lucene.search.suggest.DocumentDictionary$DocumentInputIterator.next()
                org.apache.lucene.index.IndexReader.document()
                 org.apache.lucene.index.BaseCompositeReader.document()
                  org.apache.lucene.index.SegmentReader.document()
                   org.apache.lucene.codecs.compressing.CompressingStoredFieldsReader.visitDocument()
                    org.apache.lucene.codecs.compressing.CompressionMode$4.decompress()
                     org.apache.lucene.codecs.compressing.LZ4.decompress()
                      org.apache.lucene.store.BufferedIndexInput.readBytes()
                       org.apache.lucene.store.BufferedIndexInput.readBytes()
                        org.apache.lucene.store.BufferedIndexInput.refill()
                         org.apache.lucene.store.SimpleFSDirectory$SimpleFSIndexInput.readInternal()
                          java.io.RandomAccessFile.seek[native]()

I need the stored fields for the object mapping. I don't understand why so much decompression needs to happen when loading a single document. It's as if the decompression lookup table were huge. Any tips or advice?


score 1 · Answer 1 · answered Jul 07 '14 at 21:12


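I disabled the Suggester component and the spellchecker, and startup is faster now. The trace shows why: the time is spent in SolrSuggester.build(), called from the SuggestComponent's newSearcher listener, which walks the documents through DocumentDictionary and reads (and decompresses) their stored fields. So it is not a single document load.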
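
If suggestions are still needed, a gentler alternative to removing the component may be to keep it but stop it from rebuilding whenever a searcher opens. The fragment below is only a sketch for solrconfig.xml. It assumes a Solr version whose SuggestComponent supports the buildOnStartup and buildOnCommit options (not every release does, so check the reference guide for yours). The component name, suggester name, field, and field type are hypothetical placeholders, not taken from the question.

```xml
<!-- Sketch only: names and field values below are hypothetical placeholders. -->
<searchComponent name="suggest" class="solr.SuggestComponent">
  <lst name="suggester">
    <str name="name">mySuggester</str>
    <str name="lookupImpl">AnalyzingLookupFactory</str>
    <!-- DocumentDictionary reads stored fields, which is the costly part in the trace -->
    <str name="dictionaryImpl">DocumentDictionaryFactory</str>
    <str name="field">title</str>
    <str name="suggestAnalyzerFieldType">text_general</str>
    <!-- Assumption: these options exist in your Solr version -->
    <str name="buildOnStartup">false</str>
    <str name="buildOnCommit">false</str>
  </lst>
</searchComponent>
```

With settings like these, the suggester would have to be built explicitly, for example by sending a suggest request with suggest.build=true at a time when the cost is acceptable.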


Kafkaesque
