Below are the NiFi setup and configuration:

NiFi version: 1.11.4
2 nodes - ZooKeeper
1 node - ZooKeeper & NiFi
4 nodes - NiFi

Nothing has changed in the NiFi configuration or in the certificates installed on the NiFi nodes. Even so, the cluster has suddenly started going down, with nodes being disconnected for lack of heartbeat, as in ERROR1 below.

ERROR1: while NiFi is up

2023-02-13 07:45:31,432 INFO [Process Cluster Protocol Request-16] o.a.n.c.c.node.NodeClusterCoordinator Status of prd000000123:9443 changed from NodeConnectionStatus[nodeId=prd000000123:9443, state=CONNECTED, updateId=122] to NodeConnectionStatus[nodeId=prd000000123:9443, state=DISCONNECTED, Disconnect Code=Lack of Heartbeat, Disconnect Reason=Have not received a heartbeat from node in 88 seconds, updateId=131]
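To check how often this happens and for which nodes, the status-change lines can be pulled out of nifi-app.log with a small script. This is only a sketch: the regular expression is an assumption inferred from the single line above, and the function name `parse_status_change` is made up for illustration.

```python
import re

# Assumption: this pattern is inferred from the one ERROR1 line quoted above,
# not from NiFi's source, so other message variants may not match.
STATUS_CHANGE = re.compile(
    r"Status of (?P<node>\S+) changed from "
    r"NodeConnectionStatus\[.*?state=(?P<old>\w+).*?\] to "
    r"NodeConnectionStatus\[.*?state=(?P<new>\w+)"
    r"(?:.*?Disconnect Code=(?P<code>[^,\]]+))?"
    r"(?:.*?heartbeat from node in (?P<gap>\d+) seconds)?"
)

# The ERROR1 line from the question, used as sample input.
SAMPLE = (
    "2023-02-13 07:45:31,432 INFO [Process Cluster Protocol Request-16] "
    "o.a.n.c.c.node.NodeClusterCoordinator Status of prd000000123:9443 changed from "
    "NodeConnectionStatus[nodeId=prd000000123:9443, state=CONNECTED, updateId=122] to "
    "NodeConnectionStatus[nodeId=prd000000123:9443, state=DISCONNECTED, "
    "Disconnect Code=Lack of Heartbeat, Disconnect Reason=Have not received a "
    "heartbeat from node in 88 seconds, updateId=131]"
)


def parse_status_change(line):
    """Return node, old/new state, disconnect code and heartbeat gap, or None."""
    match = STATUS_CHANGE.search(line)
    if match is None:
        return None
    info = match.groupdict()
    if info["gap"] is not None:
        info["gap"] = int(info["gap"])  # seconds without a heartbeat
    return info


if __name__ == "__main__":
    print(parse_status_change(SAMPLE))
```

Running it over the coordinator's log for each line and grouping by `node` would show whether one node or all of them miss heartbeats.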

When I restart NiFi, it fails with ERROR2 instead. If I leave it alone for 3-4 hours, the cluster eventually forms again. Can someone please help me understand why this is happening?

NoConnectedNodesException Cluster failed processing request: org.apache.nifi.cluster.manager.exception.NoConnectedNodesException. Returning Internal Server Error response.

ERROR2: when I restart NiFi

2023-02-20 14:31:57,761 INFO [Timer-Driven Process Thread-23] org.eclipse.jetty.server.Server Started @340304ms
2023-02-20 14:31:57,761 INFO [Timer-Driven Process Thread-23] o.a.n.c.s.TimerDrivenSchedulingAgent Scheduled ListenHTTP[id=4b17577d-81d0-3bdb-82f0-cdfe0e35376e] to run with 1 threads
2023-02-20 14:32:04,646 INFO [Write-Ahead Local State Provider Maintenance] org.wali.MinimalLockingWriteAheadLog org.wali.MinimalLockingWriteAheadLog@650d5a3d checkpointed with 20983 Records and 0 Swap Files in 31 milliseconds (Stop-the-world time = 3 milliseconds, Clear Edit Logs time = 1 millis), max Transaction ID 42650
2023-02-20 14:32:20,347 INFO [NiFi Web Server-20] o.a.n.c.m.e.NoConnectedNodesException Cluster failed processing request: org.apache.nifi.cluster.manager.exception.NoConnectedNodesException. Returning Internal Server Error response.
2023-02-20 14:33:00,083 INFO [pool-15-thread-1] o.a.n.c.r.WriteAheadFlowFileRepository Initiating checkpoint of FlowFile Repository
2023-02-20 14:33:00,086 INFO [pool-15-thread-1] o.a.n.c.r.WriteAheadFlowFileRepository Successfully checkpointed FlowFile Repository with 0 records in 2 milliseconds
2023-02-20 14:33:55,423 INFO [NiFi Web Server-36939] o.a.n.c.m.e.NoConnectedNodesException Cluster failed processing request: org.apache.nifi.cluster.manager.exception.NoConnectedNodesException. Returning Internal Server Error response.
2023-02-20 14:34:04,688 INFO [Write-Ahead Local State Provider Maintenance] org.wali.MinimalLockingWriteAheadLog org.wali.MinimalLockingWriteAheadLog@650d5a3d checkpointed with 20983 Records and 0 Swap Files in 41 milliseconds (Stop-the-world time = 4 milliseconds, Clear Edit Logs time = 2 millis), max Transaction ID 42650
2023-02-20 14:35:00,089 INFO [pool-15-thread-1] o.a.n.c.r.WriteAheadFlowFileRepository Initiating checkpoint of FlowFile Repository
2023-02-20 14:35:00,089 INFO [pool-15-thread-1] o.a.n.c.r.WriteAheadFlowFileRepository Successfully checkpointed FlowFile Repository with 0 records in 0 milliseconds
2023-02-20 14:35:53,842 INFO [NiFi Web Server-36854] o.a.n.c.m.e.NoConnectedNodesException Cluster failed processing request: org.apache.nifi.cluster.manager.exception.NoConnectedNodesException. Returning Internal Server Error response.
2023-02-20 14:36:04,717 INFO [Write-Ahead Local State Provider Maintenance] org.wali.MinimalLockingWriteAheadLog org.wali.MinimalLockingWriteAheadLog@650d5a3d checkpointed with 20983 Records and 0 Swap Files in 28 milliseconds (Stop-the-world time = 4 milliseconds, Clear Edit Logs time = 1 millis), max Transaction ID 42650
2023-02-20 14:37:00,089 INFO [pool-15-thread-1] o.a.n.c.r.WriteAheadFlowFileRepository Initiating checkpoint of FlowFile Repository
2023-02-20 14:37:00,089 INFO [pool-15-thread-1] o.a.n.c.r.WriteAheadFlowFileRepository Successfully checkpointed FlowFile Repository with 0 records in 0 milliseconds
2023-02-20 14:38:04,749 INFO [Write-Ahead Local State Provider Maintenance] org.wali.MinimalLockingWriteAheadLog org.wali.MinimalLockingWriteAheadLog@650d5a3d checkpointed with 20983 Records and 0 Swap Files in 31 milliseconds (Stop-the-world time = 3 milliseconds, Clear Edit Logs time = 1 millis), max Transaction ID 42650
2023-02-20 14:38:50,657 INFO [NiFi Web Server-36854] o.a.n.c.m.e.NoConnectedNodesException Cluster failed processing request: org.apache.nifi.cluster.manager.exception.NoConnectedNodesException. Returning Internal Server Error response.
2023-02-20 14:39:00,089 INFO [pool-15-thread-1] o.a.n.c.r.WriteAheadFlowFileRepository Initiating checkpoint of FlowFile Repository
2023-02-20 14:39:00,089 INFO [pool-15-thread-1] o.a.n.c.r.WriteAheadFlowFileRepository Successfully checkpointed FlowFile Repository with 0 records in 0 milliseconds
2023-02-20 14:40:04,778 INFO [Write-Ahead Local State Provider Maintenance] org.wali.MinimalLockingWriteAheadLog org.wali.MinimalLockingWriteAheadLog@650d5a3d checkpointed with 20983 Records and 0 Swap Files in 25 milliseconds (Stop-the-world time = 4 milliseconds, Clear Edit Logs time = 1 millis), max Transaction ID 42650
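To see how often the failed requests recur after a restart, the log text can be split on its timestamps, which works even when an excerpt has lost its line breaks. The following is a rough sketch, assuming every entry starts with a `yyyy-MM-dd HH:mm:ss,SSS` timestamp as in the excerpt above; the helper names `split_entries` and `no_connected_nodes_times` are made up for illustration.

```python
import re

# Assumption: every log entry begins with a timestamp in this exact format.
# The lookahead splits *before* each timestamp without consuming it
# (splitting on an empty match needs Python 3.7 or newer).
ENTRY_START = re.compile(r"(?=\d{4}-\d{2}-\d{2} \d{2}:\d{2}:\d{2},\d{3} )")


def split_entries(text):
    """Split log text into one string per entry, ignoring original line breaks."""
    joined = " ".join(text.split())  # collapse newlines and repeated spaces
    return [part.strip() for part in ENTRY_START.split(joined) if part.strip()]


def no_connected_nodes_times(text):
    """Return the timestamps of entries that mention NoConnectedNodesException."""
    return [
        entry[:23]  # "yyyy-MM-dd HH:mm:ss,SSS" is 23 characters long
        for entry in split_entries(text)
        if "NoConnectedNodesException" in entry
    ]


if __name__ == "__main__":
    sample = (
        "2023-02-20 14:32:20,347 INFO [NiFi Web Server-20] "
        "o.a.n.c.m.e.NoConnectedNodesException Cluster failed processing request: "
        "org.apache.nifi.cluster.manager.exception.NoConnectedNodesException. "
        "Returning Internal Server Error response. "
        "2023-02-20 14:33:00,083 INFO [pool-15-thread-1] "
        "o.a.n.c.r.WriteAheadFlowFileRepository Initiating checkpoint of FlowFile Repository"
    )
    print(no_connected_nodes_times(sample))
```

Each of these failures is logged from a `NiFi Web Server` thread, so the timestamps show when a web request was received while no nodes were connected, not when a node itself dropped out.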

I need to find the root cause of this issue and understand why it suddenly started happening.

Olaf Kock