I have a Fabric 1.4.9 network with 3 orderers and 2 peers, running on Kafka consensus. I want to migrate it to Raft, so, following the documented prerequisites, I updated the channel capabilities for both the application channel and the system channel. After upgrading the capabilities, I started the consensus migration by putting the network in maintenance mode and updating the consensus type like this:
"ConsensusType": {
  "mod_policy": "Admins",
  "value": {
    "metadata": {
      "consenters": [
        {
          "client_tls_cert": "CERTS",
          "host": "orderer1",
          "port": 7050,
          "server_tls_cert": "CERTS"
        },
        {
          "client_tls_cert": "CERTS",
          "host": "orderer2",
          "port": 7050,
          "server_tls_cert": "CERTS"
        },
        {
          "client_tls_cert": "CERTS",
          "host": "orderer3",
          "port": 7050,
          "server_tls_cert": "CERTS"
        }
      ],
      "options": {
        "election_tick": 10,
        "heartbeat_tick": 1,
        "max_inflight_blocks": 5,
        "snapshot_interval_size": 20971520,
        "tick_interval": "500ms"
      }
    },
    "state": "STATE_MAINTENANCE",
    "type": "etcdraft"
  },
  "version": "1"
}
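As a sanity check on the update above, here is a minimal Python sketch that reads a decoded ConsensusType value and lists the consenter endpoints. The function name `consenter_endpoints` and the `sample` document are hypothetical and not part of Fabric; the certificate values are the same "CERTS" placeholders used in the post. It only checks that the fields are present and that election_tick is greater than heartbeat_tick. It does not validate the certificates themselves.

```python
import json

def consenter_endpoints(consensus_type):
    """Return (host, port) pairs from a decoded ConsensusType config value.

    Hypothetical helper for illustration; it checks field presence only.
    """
    value = consensus_type["value"]
    if value.get("type") != "etcdraft":
        raise ValueError("ConsensusType.value.type is not etcdraft")

    metadata = value["metadata"]
    endpoints = []
    for consenter in metadata["consenters"]:
        # Every consenter needs a host, a port and both TLS certificates.
        for field in ("host", "port", "client_tls_cert", "server_tls_cert"):
            if not consenter.get(field):
                raise ValueError("consenter is missing %s" % field)
        endpoints.append((consenter["host"], int(consenter["port"])))

    if len(set(endpoints)) != len(endpoints):
        raise ValueError("duplicate consenter endpoint")

    # Raft requires the election timeout to exceed the heartbeat interval.
    options = metadata.get("options", {})
    if options and options["election_tick"] <= options["heartbeat_tick"]:
        raise ValueError("election_tick must be greater than heartbeat_tick")
    return endpoints

# Sample mirroring the structure in the post, with placeholder certificates.
sample = json.loads("""
{
  "ConsensusType": {
    "mod_policy": "Admins",
    "value": {
      "metadata": {
        "consenters": [
          {"client_tls_cert": "CERTS", "host": "orderer1", "port": 7050, "server_tls_cert": "CERTS"},
          {"client_tls_cert": "CERTS", "host": "orderer2", "port": 7050, "server_tls_cert": "CERTS"},
          {"client_tls_cert": "CERTS", "host": "orderer3", "port": 7050, "server_tls_cert": "CERTS"}
        ],
        "options": {"election_tick": 10, "heartbeat_tick": 1}
      },
      "state": "STATE_MAINTENANCE",
      "type": "etcdraft"
    },
    "version": "1"
  }
}
""")

print(consenter_endpoints(sample["ConsensusType"]))
```

The host names returned here are the ones each orderer will try to dial for Raft traffic, so they need to be resolvable from inside every orderer container.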
I applied the correct certificates in both the application channel and the system channel. After this I restarted the Docker containers. The logs showed that the consensus migration to Raft had started and that an election began, but the election never completes. All the orderers log the same thing; the election log is shown after the orderer config below.
This is my orderer config:
environment:
- GODEBUG=netdns=go
- ORDERER_HOST=orderer2
- ORDERER_GENERAL_LOGLEVEL=debug
- ORDERER_GENERAL_LISTENADDRESS=0.0.0.0
- ORDERER_GENERAL_GENESISMETHOD=file
- CONFIGTX_ORDERER_ORDERERTYPE=kafka
- CONFIGTX_ORDERER_KAFKA_BROKERS=[kafka0:9092,kafka1:9092,kafka2:9092,kafka3:9092]
- ORDERER_GENERAL_GENESISFILE=/var/hyperledger/orderer/configtx/genesis.block
- ORDERER_GENERAL_LOCALMSPID=OrdererMSP
- ORDERER_GENERAL_LOCALMSPDIR=/var/hyperledger/orderer/msp
- CORE_VM_DOCKER_HOSTCONFIG_NETWORKMODE=hyperledger-ov
- ORDERER_KAFKA_RETRY_SHORTINTERVAL=1s
- ORDERER_KAFKA_RETRY_SHORTTOTAL=30s
- ORDERER_KAFKA_VERBOSE=true
- ORDERER_GENERAL_GENESISPROFILE=eProcureOrdererGenesis
- ORDERER_ABSOLUTEMAXBYTES=10 MB
- ORDERER_PREFERREDMAXBYTES=ORDERER_PREFERREDMAXBYTES=512 KB
# enabled TLS
- ORDERER_GENERAL_TLS_ENABLED=true
- ORDERER_GENERAL_TLS_PRIVATEKEY=/var/hyperledger/orderer/tls/server.key
- ORDERER_GENERAL_TLS_CERTIFICATE=/var/hyperledger/orderer/tls/server.crt
- ORDERER_GENERAL_TLS_ROOTCAS=/var/hyperledger/orderer/msp/cacerts/ca-orderer-7054.pem
- ORDERER_GENERAL_CLUSTER_CLIENTCERTIFICATE=/var/hyperledger/orderer/tls/server.crt
- ORDERER_GENERAL_CLUSTER_CLIENTPRIVATEKEY=/var/hyperledger/orderer/tls/server.key
- ORDERER_GENERAL_CLUSTER_ROOTCAS=[/var/hyperledger/orderer/msp/cacerts/ca-orderer-7054.pem,/var/hyperledger/peers/eprocure1/peer/msp/cacerts/ca-eprocure1-7054.pem,/var/hyperledger/peers/eprocure2/peer/msp/cacerts/ca-eprocure2-7054.pem]
# Client Auth
- ORDERER_GENERAL_TLS_CLIENTAUTHREQUIRED=true
- ORDERER_GENERAL_TLS_CLIENTROOTCAS=[/var/hyperledger/orderer/msp/cacerts/ca-orderer-7054.pem,/var/hyperledger/peers/eprocure1/peer/msp/cacerts/ca-eprocure1-7054.pem,/var/hyperledger/peers/eprocure2/peer/msp/cacerts/ca-eprocure2-7054.pem]
- ORG_ADMIN_CERT=/var/hyperledger/orderer/msp/admincerts/cert.pem
This is the election log, which is the same on all the orderers:
2023-02-16 11:45:06.175 UTC [orderer.consensus.etcdraft] Step -> INFO 07e 1 is starting a new election at term 1 channel=channel node=1
2023-02-16 11:45:06.175 UTC [orderer.consensus.etcdraft] becomePreCandidate -> INFO 07f 1 became pre-candidate at term 1 channel=channel node=1
2023-02-16 11:45:06.180 UTC [orderer.consensus.etcdraft] poll -> INFO 080 1 received MsgPreVoteResp from 1 at term 1 channel=channel node=1
2023-02-16 11:45:06.180 UTC [orderer.consensus.etcdraft] campaign -> INFO 081 1 [logterm: 1, index: 3] sent MsgPreVote request to 2 at term 1 channel=channel node=1
2023-02-16 11:45:06.180 UTC [orderer.consensus.etcdraft] campaign -> INFO 082 1 [logterm: 1, index: 3] sent MsgPreVote request to 3 at term 1 channel=channel node=1
At the beginning of the logs I get these errors:
2023-02-16 11:44:33.175 UTC [orderer.consensus.etcdraft] logSendFailure -> ERRO 049 Failed to send StepRequest to 2, because: rpc error: code = Unavailable desc = all SubConns are in TransientFailure, latest connection error: connection error: desc = "transport: Error while dialing dial tcp: lookup orderer2 on 127.0.0.11:53: no such host" channel=testchainid node=1
2023-02-16 11:44:33.175 UTC [orderer.consensus.etcdraft] logSendFailure -> ERRO 04a Failed to send StepRequest to 3, because: rpc error: code = Unavailable desc = all SubConns are in TransientFailure, latest connection error: connection error: desc = "transport: Error while dialing dial tcp: lookup orderer3 on 127.0.0.11:53: no such host" channel=testchainid node=1
2023-02-16 11:44:33.705 UTC [core.comm] ServerHandshake -> ERRO 04b TLS handshake failed with error tls: first record does not look like a TLS handshake server=Orderer remoteaddress=10.0.0.3:46092
I have added the environment variables related to General.Cluster, but it made no difference.
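The "no such host" errors above come from a DNS lookup of the consenter host names. A minimal Python sketch for reproducing that lookup is below. It is an assumption that it would be run from a container that has Python and is attached to the same Docker network as the orderers; the function name `resolve_hosts` is hypothetical. The `resolver` parameter exists only so the lookup can be stubbed out.

```python
import socket

def resolve_hosts(hosts, resolver=socket.getaddrinfo):
    """Map each host name to its sorted resolved addresses, or None on failure."""
    results = {}
    for host in hosts:
        try:
            infos = resolver(host, None)
            # Each entry is (family, type, proto, canonname, sockaddr);
            # sockaddr[0] is the IP address as a string.
            results[host] = sorted({info[4][0] for info in infos})
        except socket.gaierror:
            # Same class of failure as "no such host" in the orderer log.
            results[host] = None
    return results
```

Calling `resolve_hosts(["orderer1", "orderer2", "orderer3"])` from that network and getting `None` for any name would point to a naming or network-attachment problem rather than a Raft problem. That is only one possible reading of the log, not a confirmed diagnosis.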