
For the last few days I have been working on building a production Hibernate Search cluster using the JGroups HS backend and the Infinispan directory provider (soft-index-file-store) on top of MongoDB (around 30 million records). Using the OGM MassIndexer on a standalone local WildFly, indexing worked well with almost no configuration. However, now that I have moved it to a remote Linux cluster, it fails even though I am using the configurations suggested in several questions (like Indexing huge table with Hibernate Search).

As far as I can see, though, I can't give the OGM MassIndexer a custom configuration:

2017-12-20 16:58:12,855 WARN  [org.hibernate.ogm.massindex.impl.OgmMassIndexer] (default task-1) OGM000031: OgmMassIndexer doesn't support the configuration option 'threadsToLoadObjects'. Its setting will be ignored.
2017-12-20 16:58:12,854 WARN  [org.hibernate.ogm.massindex.impl.OgmMassIndexer] (default task-1) OGM000031: OgmMassIndexer doesn't support the configuration option 'idFetchSize'. Its setting will be ignored.
2017-12-20 15:19:10,194 WARN  [org.hibernate.ogm.massindex.impl.OgmMassIndexer] (default task-1) OGM000031: OgmMassIndexer doesn't support the configuration option 'threadsToLoadObjects'. Its setting will be ignored.

Doing some digging I found THIS and understood that these options only exist for the non-OGM MassIndexer, so I can't configure those properties to optimize the batch indexing job.

On my latest attempts I always get a GC overhead limit exceeded error:

[Server:server-one] 17:18:26,987 ERROR [org.hibernate.search.exception.impl.LogErrorHandler] (Hibernate OGM: BatchIndexingWorkspace-1) HSEARCH000058: HSEARCH000116: Unexpected error during MassIndexer operation: org.apache.lucene.store.AlreadyClosedException: this IndexWriter is closed
[Server:server-one]     at org.apache.lucene.index.IndexWriter.ensureOpen(IndexWriter.java:720)
[Server:server-one]     at org.apache.lucene.index.IndexWriter.ensureOpen(IndexWriter.java:734)
[Server:server-one]     at org.apache.lucene.index.IndexWriter.getAnalyzer(IndexWriter.java:1163)
[Server:server-one]     at org.hibernate.search.backend.impl.lucene.IndexWriterDelegate.<init>(IndexWriterDelegate.java:39)
[Server:server-one]     at org.hibernate.search.backend.impl.lucene.AbstractWorkspaceImpl.getIndexWriterDelegate(AbstractWorkspaceImpl.java:217)
[Server:server-one]     at org.hibernate.search.backend.impl.lucene.LuceneBackendTaskStreamer.doWork(LuceneBackendTaskStreamer.java:44)
[Server:server-one]     at org.hibernate.search.backend.impl.lucene.WorkspaceHolder.applyStreamWork(WorkspaceHolder.java:74)
[Server:server-one]     at org.hibernate.search.indexes.spi.DirectoryBasedIndexManager.performStreamOperation(DirectoryBasedIndexManager.java:103)
[Server:server-one]     at org.hibernate.search.backend.impl.StreamingOperationExecutorSelector$AddSelectionExecutor.performStreamOperation(StreamingOperationExecutorSelector.java:106)
[Server:server-one]     at org.hibernate.search.backend.impl.batch.DefaultBatchBackend.sendWorkToShards(DefaultBatchBackend.java:73)
[Server:server-one]     at org.hibernate.search.backend.impl.batch.DefaultBatchBackend.enqueueAsyncWork(DefaultBatchBackend.java:49)
[Server:server-one]     at org.hibernate.ogm.massindex.impl.TupleIndexer.index(TupleIndexer.java:111)
[Server:server-one]     at org.hibernate.ogm.massindex.impl.TupleIndexer.index(TupleIndexer.java:89)
[Server:server-one]     at org.hibernate.ogm.massindex.impl.TupleIndexer.runIndexing(TupleIndexer.java:202)
[Server:server-one]     at org.hibernate.ogm.massindex.impl.TupleIndexer.run(TupleIndexer.java:192)
[Server:server-one]     at org.hibernate.ogm.massindex.impl.OptionallyWrapInJTATransaction.consumeInTransaction(OptionallyWrapInJTATransaction.java:128)
[Server:server-one]     at org.hibernate.ogm.massindex.impl.OptionallyWrapInJTATransaction.consume(OptionallyWrapInJTATransaction.java:97)
[Server:server-one]     at org.hibernate.ogm.datastore.mongodb.MongoDBDialect.forEachTuple(MongoDBDialect.java:762)
[Server:server-one]     at org.hibernate.ogm.dialect.impl.ForwardingGridDialect.forEachTuple(ForwardingGridDialect.java:168)
[Server:server-one]     at org.hibernate.ogm.massindex.impl.BatchIndexingWorkspace.run(BatchIndexingWorkspace.java:77)
[Server:server-one]     at java.util.concurrent.ThreadPoolExecutor.runWorker(ThreadPoolExecutor.java:1149)
[Server:server-one]     at java.util.concurrent.ThreadPoolExecutor$Worker.run(ThreadPoolExecutor.java:624)
[Server:server-one]     at java.lang.Thread.run(Thread.java:748)
[Server:server-one] Caused by: java.lang.OutOfMemoryError: GC overhead limit exceeded
[Server:server-one]     at org.apache.lucene.codecs.lucene50.Lucene50PostingsWriter.newTermState(Lucene50PostingsWriter.java:174)
[Server:server-one]     at org.apache.lucene.codecs.lucene50.Lucene50PostingsWriter.newTermState(Lucene50PostingsWriter.java:57)
[Server:server-one]     at org.apache.lucene.codecs.PushPostingsWriterBase.writeTerm(PushPostingsWriterBase.java:166)
[Server:server-one]     at org.apache.lucene.codecs.blocktree.BlockTreeTermsWriter$TermsWriter.write(BlockTreeTermsWriter.java:1041)
[Server:server-one]     at org.apache.lucene.codecs.blocktree.BlockTreeTermsWriter.write(BlockTreeTermsWriter.java:456)
[Server:server-one]     at org.apache.lucene.codecs.perfield.PerFieldPostingsFormat$FieldsWriter.write(PerFieldPostingsFormat.java:198)
[Server:server-one]     at org.apache.lucene.codecs.FieldsConsumer.merge(FieldsConsumer.java:105)
[Server:server-one]     at org.apache.lucene.index.SegmentMerger.mergeTerms(SegmentMerger.java:193)
[Server:server-one]     at org.apache.lucene.index.SegmentMerger.merge(SegmentMerger.java:95)
[Server:server-one]     at org.apache.lucene.index.IndexWriter.mergeMiddle(IndexWriter.java:4086)
[Server:server-one]     at org.apache.lucene.index.IndexWriter.merge(IndexWriter.java:3666)
[Server:server-one]     at org.apache.lucene.index.ConcurrentMergeScheduler.doMerge(ConcurrentMergeScheduler.java:588)
[Server:server-one]     at org.apache.lucene.index.ConcurrentMergeScheduler$MergeThread.run(ConcurrentMergeScheduler.java:626)

How I call the MassIndexer:

@PersistenceContext(name = "ogm-persistence")
EntityManager em;

public void createIndex() throws InterruptedException {
    FullTextEntityManager ftem = Search.getFullTextEntityManager(em);
    ftem.createIndexer(EventEntity.class)
            .batchSizeToLoadObjects(30)
            .threadsToLoadObjects(4)
            .cacheMode(CacheMode.NORMAL)
            .startAndWait();
}
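
Since the OgmMassIndexer ignores those tuning options, a workaround I'm considering is to index manually in small batches through the FullTextEntityManager, flushing and clearing periodically so processed entities can be garbage collected. This is only a sketch under my assumptions (it reuses the injected em above; OGM's JP-QL subset would have to support pagination on EventEntity; indexManually and the batch size of 100 are placeholders to tune):

import java.util.List;
import org.hibernate.search.jpa.FullTextEntityManager;
import org.hibernate.search.jpa.Search;

public void indexManually() {
    FullTextEntityManager ftem = Search.getFullTextEntityManager(em);
    final int batchSize = 100; // placeholder; small batches keep heap usage bounded
    int first = 0;
    List<EventEntity> batch;
    do {
        // load one page of entities at a time
        batch = ftem.createQuery("from EventEntity", EventEntity.class)
                .setFirstResult(first)
                .setMaxResults(batchSize)
                .getResultList();
        for (EventEntity e : batch) {
            ftem.index(e); // queue this entity for (re)indexing
        }
        ftem.flushToIndexes(); // push the queued work to the Lucene backend
        ftem.clear();          // detach entities so the GC can reclaim them
        first += batchSize;
    } while (!batch.isEmpty());
}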

My persistence.xml:

<property name="hibernate.transaction.jta.platform" value="JBossAS" />
<property name="hibernate.ogm.datastore.provider" value="mongodb"/>
<property name="hibernate.ogm.datastore.database" value="*****"/>
<property name="hibernate.ogm.datastore.host" value="*******"/>
<property name="hibernate.ogm.datastore.port" value="27017"/>
<property name="hibernate.search.default.directory_provider" value="infinispan"/>
<property name="hibernate.search.default.worker.backend" value="jgroups"/>
<property name="hibernate.search.default.exclusive_index_use" value="false"/>
<property name="hibernate.search.lucene_version" value="LUCENE_CURRENT"/>
<property name="hibernate.search.default.optimizer.operation_limit.max" value="10000"/>
<property name="hibernate.search.default.optimizer.transaction_limit.max" value="1000"/>
<property name="hibernate.search.worker.execution" value="sync"/>
<property name="hibernate.search.reader.strategy" value="shared"/>
<property name="hibernate.search.infinispan.chunk_size" value="300000000"/>
<property name="wildfly.jpa.hibernate.search.module" value="none"/>
<property name="hibernate.search.infinispan.configuration_resourcename" value="infinispan-config.xml"/>

My infinispan-config.xml:

<cache-container name="hibernate-search" jndi-name="java:jboss/infinispan/container/hibernate-search">
    <transport lock-timeout="330000"/>
    <replicated-cache name="LuceneIndexesMetadata" mode="SYNC" remote-timeout="330000" >
        <locking striping="false" acquire-timeout="330000" concurrency-level="500"/>
        <transaction mode="NONE"/>
        <expiration max-idle="-1"/>
        <state-transfer timeout="480000"/>
        <persistence passivation="true">
            <soft-index-file-store xmlns="urn:infinispan:config:store:soft-index:8.0" preload="true" fetch-state="true" >
                <index path="/var/LuceneIndexesMetadata/index" />
                <data path="/var/LuceneIndexesMetadata/data" />
                <write-behind/>
            </soft-index-file-store>
        </persistence>
    </replicated-cache>
    <replicated-cache name="LuceneIndexesData" mode="SYNC" remote-timeout="25000">
        <locking striping="false" acquire-timeout="330000" concurrency-level="500"/>
        <state-transfer timeout="480000"/>
        <transaction mode="NONE"/>
        <eviction strategy="LRU" max-entries="500"/>
        <expiration max-idle="-1"/>
        <persistence passivation="true">
            <soft-index-file-store xmlns="urn:infinispan:config:store:soft-index:8.0" preload="true" fetch-state="true">
                <index path="/var/LuceneIndexesData/index" />
                <data sync-writes="true" path="/var/LuceneIndexesData/data" />
                <write-behind/>
            </soft-index-file-store>
        </persistence>
    </replicated-cache>
    <replicated-cache name="LuceneIndexesLocking" mode="SYNC" remote-timeout="25000">
        <locking striping="false" acquire-timeout="330000" concurrency-level="500"/>
        <transaction mode="NONE"/>
        <expiration max-idle="-1"/>
        <state-transfer  timeout="480000"/>
        <persistence passivation="true">
            <soft-index-file-store xmlns="urn:infinispan:config:store:soft-index:8.0" preload="true" fetch-state="true">
                <index path="/var/LuceneIndexesLocking/index" />
                <data path="/var/LuceneIndexesLocking/data" />
                <write-behind/>
            </soft-index-file-store>
        </persistence>
    </replicated-cache>
</cache-container>
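
Following the same sizing concern: with 300 MB chunks and eviction max-entries="500" on LuceneIndexesData, the worst case is roughly 500 × 300 MB ≈ 150 GB of index data kept in heap, far beyond any Xmx I tried. A sketch of a tighter eviction limit for that cache (50 is an assumption to tune together with a smaller chunk size):

<eviction strategy="LRU" max-entries="50"/>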

I need to ensure that indexing will work for even larger numbers of records than 30 million, that the index synchronizes without problems when a new stateless node starts up, and that I can restart without rebuilding the whole index (a persisted index). Any suggestions for possible architectures and changes to my code are welcome.

Thanks a lot.

WildFly 10, Hibernate Search 5.6.1, Infinispan 8.2.5, from the OGM 5.1 BOM

Update:

This is a picture of VisualVM when I get the error: Java Heap Space

This is the heap dump file produced by VisualVM: heapdump
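
In case it's related: in domain mode the heap settings do not come from standalone.conf but from the <jvm> definitions in host.xml (or domain.xml), so the Xmx values I experimented with may never have been applied to the actual server JVMs. A sketch of an explicit per-server heap setting in host.xml (sizes are placeholders; "main-server-group" is an assumption about the group name):

<servers>
    <server name="server-one" group="main-server-group">
        <jvm name="default">
            <heap size="1024m" max-size="4096m"/>
        </jvm>
    </server>
</servers>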

Panos
  • Could you use VisualVM to get a memory dump and take a look at it? It might be interesting to check that we don't have a memory leak somewhere. You said it worked locally; was it with the same data? If so, there's probably something you should tune on your remote server (Xmx maybe?). – Guillaume Smet Dec 21 '17 at 10:30
  • Yes, it was with exactly the same database. I tried several WildFly Xmx/Xms settings (from 500 MB to 4000 MB); it's as if they don't affect my issue at all. I'll try VisualVM tomorrow and tell you about my results. Thanks a lot for your answer. – Panos Dec 21 '17 at 16:46
  • Having a quick read of your configuration it looks fine (I'd need to replicate it all to be sure, so I might be wrong), but I suspect this setup would simply require more than 4 GB of heap: you're allowing it to store 500 index entries of data in heap, and each entry is allowed to be very large. Might be worth generating that same data on disk first to have an idea of actual memory requirements. Keep in mind Infinispan will also need multiple copies of the entries around, so add an additional ~30% to the estimates. – Sanne Dec 21 '17 at 18:50
  • @GuillaumeSmet I uploaded a picture and a file, in case you want to have a look. For me it's strange that the GC works well and never even spikes high; there must be something else :/ I've been working on it the whole day with different configs, but still with unexpected results. Anyway, VisualVM was a nice tool to get to know, thanks a lot for the recommendation and for your help! – Panos Dec 22 '17 at 15:29
  • @Sanne This is an approach I didn't think of, i.e. the chunk and eviction values are what I have to test the most to reach a stable solution. I made some tests but I'm still having a hard time with replication and indexing (both memory and lock issues). While I was looking for a solution I hit this thread https://developer.jboss.org/thread/273679 and this https://stackoverflow.com/questions/23544160/infinispan-distributed-cluster-with-shared-index . It seems the possible configurations are so many that it's hard to find a stable one for my purpose. I appreciate your help a lot! Thanks!! – Panos Dec 22 '17 at 15:33
  • FYI, the problem was WildFly's domain mode. Locally I was using standalone mode, and the VisualVM results were much better: faster indexing and strong CPU and memory usage (indexing was clearly using the available resources). Then I tried standalone-ha.xml on the remote machine and the result was effectively the same. When indexing in domain mode, the GC and CPU spikes were low, and I always got the exception mentioned above and some other heap exceptions. I don't know how it could be that much different, since I used domain mode with two machines. Thanks a lot! – Panos Jan 04 '18 at 08:48
