
I set up a cluster with one master node and three slaves using Cloudera CDH 5.8.0. After some configuration work, all services are healthy except one: HBase. A few minutes after each restart, its health goes bad.

The error displayed in Cloudera Manager is: "Bad : Master summary: This health test is bad because the Service Monitor did not find an active Master". I checked the Service Monitor logs and found this warning:

(7 skipped) Exception in doWork for task: hbase_HBASE_SERVICE_STATE_TASK
org.apache.hadoop.hbase.client.RetriesExhaustedException: Failed after attempts=1, exceptions:
Thu Dec 15 09:38:30 CET 2016, RpcRetryingCaller{globalStartTime=1481791110299, pause=100, retries=1},     org.apache.hadoop.hbase.ipc.RemoteWithExtrasException(org.apache.hadoop.hbase.PleaseHoldException): org.apache.hadoop.hbase.PleaseHoldException: Master is initializing
at org.apache.hadoop.hbase.master.HMaster.checkInitialized(HMaster.java:2303)
at org.apache.hadoop.hbase.master.MasterRpcServices.getClusterStatus(MasterRpcServices.java:782)
at org.apache.hadoop.hbase.protobuf.generated.MasterProtos$MasterService$2.callBlockingMethod(MasterProtos.java:55652)
at org.apache.hadoop.hbase.ipc.RpcServer.call(RpcServer.java:2170)
at org.apache.hadoop.hbase.ipc.CallRunner.run(CallRunner.java:109)
at org.apache.hadoop.hbase.ipc.RpcExecutor.consumerLoop(RpcExecutor.java:133)
at org.apache.hadoop.hbase.ipc.RpcExecutor$1.run(RpcExecutor.java:108)
at java.lang.Thread.run(Thread.java:745)


at org.apache.hadoop.hbase.client.RpcRetryingCaller.callWithRetries(RpcRetryingCaller.java:147)
at org.apache.hadoop.hbase.client.HBaseAdmin.executeCallable(HBaseAdmin.java:3678)
at org.apache.hadoop.hbase.client.HBaseAdmin.getClusterStatus(HBaseAdmin.java:2382)
at com.cloudera.cmf.cdh5client.hbase.HConnectionImpl.getClusterStatus(HConnectionImpl.java:50)
at com.cloudera.cmon.firehose.polling.hbase.HbaseServiceState.update(HbaseServiceState.java:158)
at com.cloudera.cmon.firehose.polling.hbase.HbaseServiceStateFetcher.doWork(HbaseServiceStateFetcher.java:42)
at com.cloudera.cmon.firehose.polling.AbstractHConnectionClientTask.doWorkWithClientConfig(AbstractHConnectionClientTask.java:95)
at com.cloudera.cmon.firehose.polling.AbstractHConnectionClientTask.doWorkWithClientConfig(AbstractHConnectionClientTask.java:26)
at com.cloudera.cmon.firehose.polling.AbstractCdhWorkUsingClientConfigs.doWork(AbstractCdhWorkUsingClientConfigs.java:45)
at com.cloudera.cmon.firehose.polling.CdhTask$InstrumentedWork.doWork(CdhTask.java:230)
at com.cloudera.cmf.cdhclient.util.ImpersonatingTaskWrapper.runTask(ImpersonatingTaskWrapper.java:72)
at com.cloudera.cmf.cdhclient.util.ImpersonatingTaskWrapper.access$000(ImpersonatingTaskWrapper.java:21)
at com.cloudera.cmf.cdhclient.util.ImpersonatingTaskWrapper$1.run(ImpersonatingTaskWrapper.java:107)
at java.security.AccessController.doPrivileged(Native Method)
at javax.security.auth.Subject.doAs(Subject.java:415)
at org.apache.hadoop.security.UserGroupInformation.doAs(UserGroupInformation.java:1671)
at com.cloudera.cmf.cdh5client.security.UserGroupInformationImpl.doAs(UserGroupInformationImpl.java:41)
at com.cloudera.cmf.cdhclient.util.ImpersonatingTaskWrapper.doWork(ImpersonatingTaskWrapper.java:103)
at com.cloudera.cmf.cdhclient.CdhExecutor$1.call(CdhExecutor.java:125)
at java.util.concurrent.FutureTask.run(FutureTask.java:262)
at java.util.concurrent.ThreadPoolExecutor.runWorker(ThreadPoolExecutor.java:1145)
at java.util.concurrent.ThreadPoolExecutor$Worker.run(ThreadPoolExecutor.java:615)
at java.lang.Thread.run(Thread.java:745)
Caused by: org.apache.hadoop.hbase.ipc.RemoteWithExtrasException(org.apache.hadoop.hbase.PleaseHoldException): org.apache.hadoop.hbase.PleaseHoldException: Master is initializing
at org.apache.hadoop.hbase.master.HMaster.checkInitialized(HMaster.java:2303)
at org.apache.hadoop.hbase.master.MasterRpcServices.getClusterStatus(MasterRpcServices.java:782)
at org.apache.hadoop.hbase.protobuf.generated.MasterProtos$MasterService$2.callBlockingMethod(MasterProtos.java:55652)
at org.apache.hadoop.hbase.ipc.RpcServer.call(RpcServer.java:2170)
at org.apache.hadoop.hbase.ipc.CallRunner.run(CallRunner.java:109)
at org.apache.hadoop.hbase.ipc.RpcExecutor.consumerLoop(RpcExecutor.java:133)
at org.apache.hadoop.hbase.ipc.RpcExecutor$1.run(RpcExecutor.java:108)
at java.lang.Thread.run(Thread.java:745)

at org.apache.hadoop.hbase.ipc.RpcClientImpl.call(RpcClientImpl.java:1219)
at org.apache.hadoop.hbase.ipc.AbstractRpcClient.callBlockingMethod(AbstractRpcClient.java:216)
at org.apache.hadoop.hbase.ipc.AbstractRpcClient$BlockingRpcChannelImplementation.callBlockingMethod(AbstractRpcClient.java:300)
at org.apache.hadoop.hbase.protobuf.generated.MasterProtos$MasterService$BlockingStub.getClusterStatus(MasterProtos.java:46458)
at org.apache.hadoop.hbase.client.ConnectionManager$HConnectionImplementation$5.getClusterStatus(ConnectionManager.java:2027)
at org.apache.hadoop.hbase.client.HBaseAdmin$28.call(HBaseAdmin.java:2386)
at org.apache.hadoop.hbase.client.HBaseAdmin$28.call(HBaseAdmin.java:2382)
at org.apache.hadoop.hbase.client.RpcRetryingCaller.callWithRetries(RpcRetryingCaller.java:126)
... 22 more

Is there a known way to solve this issue? I've assigned the master node as HBase Master and the slaves as HBase RegionServers.

daloman

2 Answers

The problem lies in the Cloudera Management Monitor Service, not in HBase itself. I restarted the Cloudera Management Monitor Service and then restarted HBase. After that, everything seemed to be fine.
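
If you would rather script that restart order than click through the UI, here is a rough sketch against the Cloudera Manager REST API. Several things are assumptions on my part and not from the answer: the API version (v13 is what I would expect for CM 5.8), the service name "hbase", and that restarting the whole Cloudera Management Service is acceptable rather than just its Service Monitor role. The helper names are made up for illustration.

```python
import base64
import json
import time
import urllib.parse
import urllib.request

API_VERSION = "v13"  # assumption: matches CM 5.8; confirm with GET /api/version


def api_root(base):
    """Build the versioned API root from the Cloudera Manager base URL."""
    return f"{base.rstrip('/')}/api/{API_VERSION}"


def restart_urls(base, cluster, service="hbase"):
    """Return the restart endpoints in order: management service, then HBase."""
    root = api_root(base)
    cluster = urllib.parse.quote(cluster, safe="")
    service = urllib.parse.quote(service, safe="")
    return [
        f"{root}/cm/service/commands/restart",
        f"{root}/clusters/{cluster}/services/{service}/commands/restart",
    ]


def call(url, user, password, method="GET"):
    """Send one authenticated request and decode the JSON reply."""
    token = base64.b64encode(f"{user}:{password}".encode()).decode()
    request = urllib.request.Request(
        url, method=method, headers={"Authorization": f"Basic {token}"}
    )
    with urllib.request.urlopen(request) as response:
        return json.load(response)


def restart_in_order(base, cluster, user, password, service="hbase"):
    """Restart both services, waiting for each command to finish."""
    for url in restart_urls(base, cluster, service):
        command = call(url, user, password, method="POST")
        # Restart commands run asynchronously, so poll until this one is done.
        while command.get("active"):
            time.sleep(5)
            command = call(f"{api_root(base)}/commands/{command['id']}", user, password)
        if not command.get("success"):
            raise RuntimeError(f"restart failed: {command.get('resultMessage')}")
```

Something like restart_in_order("http://cm-host:7180", "Cluster 1", "admin", "admin") would run it; the host, cluster name, and credentials there are placeholders. Waiting for the first command to finish matters, since the point is that the Service Monitor is back up before HBase restarts.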

user3113626

Check the log file of your HBase Master. In my case, HBase did not have sufficient permissions on the “/tmp” directory on HDFS, so I changed them and the problem went away.
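
To see whether you are in the same situation, you can look at the mode bits that `hdfs dfs -ls -d /tmp` prints. Below is a minimal sketch that parses such a line and applies plain owner/group/other checks. It assumes the HBase processes run as user "hbase" in group "hbase", and it ignores HDFS ACLs and superuser rights, so treat the result as a hint only. The function names are mine.

```python
import subprocess


def parse_ls_line(line):
    """Split one `hdfs dfs -ls` output line into (mode, owner, group)."""
    fields = line.split()
    return fields[0], fields[2], fields[3]


def can_write(mode, owner, group, user="hbase", user_groups=("hbase",)):
    """Check the write bit of the permission class that applies to `user`.

    Only the first matching class (owner, then group, then other) counts.
    ACLs and superuser privileges are not considered.
    """
    if user == owner:
        return mode[2] == "w"
    if group in user_groups:
        return mode[5] == "w"
    return mode[8] == "w"


def hbase_can_write_tmp(path="/tmp"):
    """Run `hdfs dfs -ls -d` on the path and evaluate its mode for hbase."""
    result = subprocess.run(
        ["hdfs", "dfs", "-ls", "-d", path],
        capture_output=True, text=True, check=True,
    )
    # Use the last non-empty line in case any header line is printed first.
    lines = [l for l in result.stdout.splitlines() if l.strip()]
    return can_write(*parse_ls_line(lines[-1]))
```

If hbase_can_write_tmp() returns False on a node with the HDFS client configured, the permissions on “/tmp” are worth a closer look, along with whatever the HBase Master log reports.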

Gavin Gu