I use Solr 4.4, Zookeeper 3.4.5 and Tomcat 7.
CLUSTER SETUP: 3 shards and 3 replicas, 6 Solr instances in total.
The cluster is up and running and everything seems to be OK. There is nothing critical in the logs, except a few warnings about deprecated classes.
HOW I DO CONFIGURATION UPDATE:
run following command:
java -classpath .:solr-jars/* org.apache.solr.cloud.ZkCLI -cmd upconfig -zkhost ZOOKEEPER_HOST:PORT -d solr-conf -confname myconf
check that config was updated in Zookeeper:
/var/zookeeper/bin/zkCli.sh -server ZOOKEEPER_HOST:PORT
ls /configs/myconfig/schema.xml
ls /configs/myconfig/solrconfig.xml
collection reload via Solr Collection API
curl 'HOST/solr/admin/collections?action=RELOAD&name=collection1'
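Put together, the upload and reload steps look roughly like the script below. This is only a sketch: the variable names, their defaults, the `http://` prefix and the `RUN_UPDATE` guard are placeholders I made up for illustration, and the manual check in zkCli.sh is left out.

```shell
#!/bin/sh
# Sketch of the update procedure above. All values are placeholders.
ZK_HOST="${ZK_HOST:-ZOOKEEPER_HOST:PORT}"
SOLR_URL="${SOLR_URL:-http://HOST}"
CONF_DIR="${CONF_DIR:-solr-conf}"
CONF_NAME="${CONF_NAME:-myconf}"
COLLECTION="${COLLECTION:-collection1}"

# Build the Collections API RELOAD URL from a base URL and a collection name.
reload_url() {
  printf '%s/solr/admin/collections?action=RELOAD&name=%s\n' "$1" "$2"
}

# Upload the local config directory to ZooKeeper under CONF_NAME.
upload_config() {
  java -classpath ".:solr-jars/*" org.apache.solr.cloud.ZkCLI \
    -cmd upconfig -zkhost "$ZK_HOST" -d "$CONF_DIR" -confname "$CONF_NAME"
}

# Ask Solr to reload the collection. The URL is passed as one quoted
# argument so the shell does not interpret "&" as a background operator.
reload_collection() {
  curl -sS "$(reload_url "$SOLR_URL" "$COLLECTION")"
}

# Only run the steps when explicitly requested, so the file can be
# sourced without side effects.
if [ "${RUN_UPDATE:-0}" = "1" ]; then
  upload_config && reload_collection
fi
```

With the real values exported, it would be run as `RUN_UPDATE=1 sh update-config.sh` (the file name is also just an example).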
The config update seems to be applied successfully to all nodes. But sometimes one node in the cluster goes down (marked as brown in the Solr Admin UI). Neither a Tomcat restart nor a collection reload brings this node back.
Error message from logs:
SyncStrategy - No UpdateLog found - cannot sync
QUESTIONS:
- Is there any way to bring the failed node back to life, without having to remove all its data, of course?
- What is the right way to force Solr nodes to accept the configuration after it was updated in Zookeeper? Without a Tomcat restart if possible (it is a production system).
- (optional) In general, what is your feeling about SolrCloud stability and predictability? While working with SolrCloud, I found a lot of complaints and questions about it from other people. That doesn't look like a good sign.
UPDATE 1: It looks like the error message wasn't related to the actual problem. After I configured the transaction log, this error disappeared. But a few nodes still go down after a collection reload.
The only way to bring a node back is to directly edit clusterstate.json in Zookeeper and change the node status to "active". After that, the node seems to be OK and stable.
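For reference, the edit I make to clusterstate.json amounts to the transformation below. It is a rough sketch, not a recommended tool: the function name `mark_down_replicas_active` is made up for illustration, it assumes the failed replica is recorded with `"state":"down"`, and it flips every such entry in the file rather than one specific node.

```shell
#!/bin/sh
# Read clusterstate.json content on stdin and write it to stdout with
# every "state":"down" entry changed to "state":"active".
# Assumption: the failed replica is stored with the state value "down".
mark_down_replicas_active() {
  sed 's/"state":[[:space:]]*"down"/"state":"active"/g'
}

# Example with a single (hypothetical) replica entry:
#   printf '%s\n' '{"core_node1":{"state":"down"}}' | mark_down_replicas_active
```

The JSON itself is read and written back with the `get /clusterstate.json` and `set /clusterstate.json ...` commands inside zkCli.sh; as far as I can tell, `get` also prints znode stat information after the data, so the output needs trimming before it is piped through the function.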