I want to understand the write operation in detail for SolrCloud and had a few questions on the architecture:

  1. Does ZooKeeper send the document write request to all leaders?

    Solr Wiki: Each shard can exist in multiple copies; these copies of the same shard are called replicas. One of the replicas within a shard is the leader, designated by a leader-election process.

  2. SolrCloud has leaders and replicas, so do all leaders run the hashing process described below before indexing a document, or is one particular leader responsible for it?

    Solr Wiki: document ID is used to calculate the hash Solr uses to determine the shard a document is sent to for indexing.

  3. If document indexing fails for some reason (for example, the leader goes down), does a replica node try to reindex that document, or what is the failover mechanism?

  4. The write operation is considered complete only when all replicas within a shard have successfully indexed the document. True or false?

kellyfj
Rahul Sharma

1 Answer


Here is my understanding:

1) ZooKeeper does not write any documents to SolrCloud. Each SolrCloud node uses ZooKeeper to store shared configuration and to track the shared state of the nodes, which is what makes leader election and replica-state monitoring possible. ZooKeeper is not involved in querying collections or in applying updates. See also https://stackoverflow.com/a/19628852/277023

2) At least for the SolrJ client, the choice of which shard to write the document to is made by the client, not by the leader. See https://lucene.apache.org/solr/guide/7_0/shards-and-indexing-data-in-solrcloud.html for more details.
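To make the idea of hash-based routing concrete, here is a minimal sketch of how a document ID can be mapped to a shard. It is not Solr's code: as far as I know Solr's default router uses a MurmurHash3-based hash and per-shard hash ranges kept in the cluster state, whereas this sketch uses `zlib.crc32` as a stand-in and splits the 32-bit space evenly. The function names `shard_ranges` and `pick_shard` are made up for the illustration.

```python
import zlib


def shard_ranges(num_shards):
    """Split the unsigned 32-bit hash space into contiguous ranges, one per shard."""
    space = 2 ** 32
    step = space // num_shards
    ranges = []
    for i in range(num_shards):
        lo = i * step
        # The last shard absorbs any remainder so the whole space is covered.
        hi = space - 1 if i == num_shards - 1 else (i + 1) * step - 1
        ranges.append((lo, hi))
    return ranges


def pick_shard(doc_id, ranges):
    """Return the index of the shard whose range contains the hash of doc_id."""
    # Stand-in hash (assumption): Solr itself does not use CRC32 for routing.
    h = zlib.crc32(doc_id.encode("utf-8"))
    for index, (lo, hi) in enumerate(ranges):
        if lo <= h <= hi:
            return index
    raise ValueError("hash fell outside every shard range")


ranges = shard_ranges(4)
print(pick_shard("doc-1", ranges))
```

The point is that the mapping depends only on the document ID and the shard ranges, so any party that knows the ranges (a cluster-aware client or any node) can work out the target shard without asking a particular leader.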

3) I do not know the answer to that question

4) The write operation is considered successful as described in the following quote:

Transaction logs are integral to the data guarantees of Solr4, and also a place people get into trouble, so let’s talk about them a bit. The indexing flow in SolrCloud is as follows: Incoming documents are received by a node and forwarded to the proper leader. From the leader they’re sent to all replicas for the relevant shard. The replicas respond to their leader. The leader responds to the originating node. After all the leaders have responded, the originating node replies to the client. At this point, all documents have been flushed to the tlog for all the nodes in the cluster!

From

https://lucidworks.com/2013/08/23/understanding-transaction-logs-softcommit-and-commit-in-sorlcloud/
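The flow in the quote can be sketched as a toy simulation. This is only an illustration under my own assumptions, not Solr's implementation: the classes `Replica` and `Shard`, the function `receive`, and the `route` callback are all hypothetical, and the transaction log is modelled as a plain list.

```python
class Replica:
    """A node holding one copy of a shard, with a toy transaction log."""

    def __init__(self, name):
        self.name = name
        self.tlog = []

    def index(self, doc):
        # Record the document in the tlog and acknowledge.
        self.tlog.append(doc)
        return True


class Shard:
    """One leader plus the other replicas of the same shard."""

    def __init__(self, leader, followers):
        self.leader = leader
        self.followers = followers

    def leader_index(self, doc):
        # The leader indexes the document, then sends it to every other replica
        # and collects their responses before answering the originating node.
        ok = self.leader.index(doc)
        acks = [replica.index(doc) for replica in self.followers]
        return ok and all(acks)


def receive(docs, shards, route):
    """Originating node: forward each doc to its shard leader, reply when all respond."""
    results = [shards[route(doc)].leader_index(doc) for doc in docs]
    return all(results)


shards = [
    Shard(Replica("s1-leader"), [Replica("s1-r1")]),
    Shard(Replica("s2-leader"), [Replica("s2-r1")]),
]
# Hypothetical routing rule for the demo only: pick the shard by ID length.
ok = receive(["a", "bb", "ccc"], shards, route=lambda doc: len(doc) % 2)
print(ok)
```

In this sketch the client only gets its reply after every leader involved has heard back from its replicas, which matches the order of steps in the quoted passage.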

I hope that helps

kellyfj