
MarkLogic 9.0.8.2

We have configured a MarkLogic cluster (3 nodes) as follows (PF = primary forest, RF = replica forest):

Host1: PF01 RF02 RF03

Host2: PF02 RF01 RF03

Host3: PF03 RF01 RF02

In the normal scenario, when we look at the database status page, each PF should be in the "open" state and everything else should be in the "wait replicating" state.

Now, for some reason, RF01 and RF02 have moved to the "open" state, which means all the load is now on Host3.

Reindexing is in progress.

The question is: while reindexing is in progress, can we restart RF01-2 and RF02-2 so that PF01 and PF02 are in the "open" state again?

Are there any best practices we should follow for forest restart?

Manish Joisar

2 Answers


See this knowledgebase article: What Triggers Failover in MarkLogic Server

If this occurred during a reindex, I suspect that you may not have sufficient resources to support the workload. Depending on what "normal" is for your system, reindexing can cause an increase in resource demands.

Look through your ErrorLog.txt and I suspect you will see signs of trouble. Look for any log events with a severity greater than INFO (NOTICE, WARNING, ERROR, CRITICAL).
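If the log is large, a small script can do the filtering. The following is only a sketch: it assumes each entry starts with a timestamp followed by the level name and a colon (for example `2020-03-16 13:40:02.500 Warning: ...`), and the severity ordering in `SEVERITIES` and the sample messages are my own illustration, not output copied from a real server.

```python
import re

# Assumed line layout: "YYYY-MM-DD HH:MM:SS.mmm Level: message".
LOG_LINE = re.compile(r"^\d{4}-\d{2}-\d{2} \d{2}:\d{2}:\d{2}\.\d+ (\w+): (.*)$")

# Level names ordered from least to most severe (assumed ordering).
SEVERITIES = ["Finest", "Finer", "Fine", "Debug", "Config", "Info",
              "Notice", "Warning", "Error", "Critical", "Alert", "Emergency"]
RANK = {name: i for i, name in enumerate(SEVERITIES)}


def lines_above_info(lines):
    """Return (level, message) pairs for entries more severe than Info."""
    hits = []
    for line in lines:
        match = LOG_LINE.match(line.rstrip("\n"))
        if not match:
            continue  # continuation lines or unknown layouts are skipped
        level, message = match.groups()
        if RANK.get(level, -1) > RANK["Info"]:
            hits.append((level, message))
    return hits


# Made-up sample lines, only to show the filter working.
sample = [
    "2020-03-16 13:40:01.120 Info: made-up info text",
    "2020-03-16 13:40:02.500 Warning: made-up warning text",
    "2020-03-16 13:40:03.750 Error: made-up error text",
    "not a log line at all",
]
found = lines_above_info(sample)
```

To run it against a real log, pass an open file object instead of `sample`, e.g. `lines_above_info(open(path_to_error_log))`, where `path_to_error_log` is wherever your ErrorLog.txt lives.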

You may need to reduce the throttle of your reindexing and consider scaling options to ensure that your cluster is appropriately sized. You might consider opening a MarkLogic Support ticket, if you are eligible.

If you think that the primary hosts are ready to take over, you can "flip the forests" back to the primary by restarting the replica forests that are now open.

See the documentation: Reverting a Failed Over Forest Back to the Primary Host

Also, the following knowledgebase article provides a script that can be executed in order to flip all of the open replicas:

https://help.marklogic.com/Knowledgebase/Article/View/427/0/scripting-failover-flipping-replica-forests-back-to-their-masters-using-xquery
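The selection logic behind "flipping" is simple enough to sketch separately from the restart itself. This is not the script from the article; it is a minimal illustration in Python that assumes you have already collected each forest's state (for example from the database status page). The function name, the dictionaries, and the forest names such as `RF01-1` are hypothetical.

```python
def replicas_to_restart(replicas_of, state_of):
    """Pick replica forests that are open while their primary is not.

    replicas_of: primary forest name -> list of its replica forest names
    state_of:    forest name -> state string as shown on the status page
    """
    selected = []
    for primary, replicas in replicas_of.items():
        if state_of.get(primary) == "open":
            continue  # primary is already serving, nothing to flip
        selected.extend(r for r in replicas if state_of.get(r) == "open")
    return sorted(selected)


# Hypothetical layout modelled on the question: two replicas per primary.
replicas_of = {
    "PF01": ["RF01-1", "RF01-2"],
    "PF02": ["RF02-1", "RF02-2"],
    "PF03": ["RF03-1", "RF03-2"],
}
# Illustrative states after the failover described in the question.
state_of = {
    "PF01": "wait replicating", "RF01-1": "wait replicating", "RF01-2": "open",
    "PF02": "wait replicating", "RF02-1": "wait replicating", "RF02-2": "open",
    "PF03": "open", "RF03-1": "wait replicating", "RF03-2": "wait replicating",
}
to_restart = replicas_to_restart(replicas_of, state_of)
```

With the states above, `to_restart` contains only the two replicas that took over, so PF03 and its replicas are left alone. The restart of each selected forest would still be done through the Admin UI or the script linked above.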

Mads Hansen
  • Thanks MH for the quick, detailed reply. So when should I assume that the primary host is ready to take over? Once the reindex is completed? Or can I do it in between? Or might I lose data, as mentioned by bosari? If so, there will be too much load on Host3, as all open forests are now being served from this server. – Manish Joisar Mar 16 '20 at 13:47
  • I would take an assessment of the health of your system (looking for any Notice, Warning, or Error messages). You can always disable the re-indexer and wait for things to settle down, then flip the forests, then look to resume. In general, you should be able to flip them, but any time you are causing unavailability and have transactions in flight there is a chance that something could go wrong. If you can control that, you minimize the risk of errors and issues. – Mads Hansen Mar 16 '20 at 14:33

I believe you should wait until your re-indexing is complete. Once the re-indexing has completed, you can flip the forests back. Because re-indexing started while the master was in the open state, it is possible that some documents had not been synced to the master forests by the time re-indexing completed or failed. Those documents might get missed.

bosari
  • Thanks Bosari for your quick reply. In this case there will be too much load on Host3 as all open forests are now served from this server. – Manish Joisar Mar 16 '20 at 13:48