Yesterday I ran into an interesting problem: nothing happens when a chaincode container crashes or someone stops it manually.
Sample network (using v1.2.0 images):
- 2 ORGs
- 2 CAs
- 2 peers in ORG1 (using LevelDB as storage)
- 2 peers in ORG2 (using LevelDB as storage)
- 1 solo orderer
- 1 shared channel with a consortium of the 2 orgs
- the network runs under Docker Swarm on 4 VMs (2 manager nodes, 2 worker nodes)
There are many elements that can break down:
- the chaincode container crashes (!)
- one or both peers of ORG1 crash
- the orderer crashes
So, Fabric's default behavior:
- the chaincode container crashes (!)
  - SDK requests stop being processed; the container is not restarted
  - UPD_1: with the next request (invoke/query) the cc-container is recreated
- one or both peers of ORG1 crash
  - requests stop being processed because the connection to the channel is lost
  - after auto-restart/start on failure: the connection to the channel is still lost
  - if chaincode was instantiated on this peer: the chaincode container crashes
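Given UPD_1, a client-side retry might be enough to ride through a chaincode container crash, since a later request triggers the container to be recreated. Below is a minimal Python sketch of that idea. It assumes a hypothetical `invoke` callable that wraps whatever SDK invoke/query call is in use; `invoke_with_retry` and its parameters are illustrative names, not part of any Fabric SDK, and I have not verified whether the first request after a crash actually fails.

```python
import time


def invoke_with_retry(invoke, attempts=3, delay=1.0, sleep=time.sleep):
    """Call invoke() until it succeeds or the attempts are used up.

    invoke   -- hypothetical zero-argument callable wrapping the SDK request
    attempts -- maximum number of calls (must be >= 1)
    delay    -- base wait in seconds; grows linearly with the attempt number
    sleep    -- injectable sleep function, so the wait can be skipped in tests
    """
    if attempts < 1:
        raise ValueError("attempts must be >= 1")
    last_error = None
    for attempt in range(1, attempts + 1):
        try:
            return invoke()
        except Exception as err:  # assumption: any SDK failure is retryable
            last_error = err
            if attempt < attempts:
                # give the peer time to recreate the chaincode container
                sleep(delay * attempt)
    raise last_error
```

This only covers the chaincode container case. It does nothing for a crashed peer or orderer, which is the part I am really asking about.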
So, what are the strategies/best practices for restoring a Hyperledger Fabric network after crashes?