13

I got this error

Installation failed. Failed to receive heartbeat from agent.

when I was installing cloudera on single node. This is what is in my /etc/hosts file:

127.0.0.1   localhost
192.168.2.131   ubuntu

This is what is in my /etc/hostname file:

ubuntu

And this is the error in my /var/log/cloudera-scm-agent file:

[13/Jun/2014 12:31:58 +0000] 15366 MainThread agent        INFO     To override these variables, use /etc/cloudera-scm-agent/config.ini. Environment variables for CDH locations are not used when CDH is installed from parcels.
[13/Jun/2014 12:31:58 +0000] 15366 MainThread agent        INFO     Re-using pre-existing directory: /run/cloudera-scm-agent/process
[13/Jun/2014 12:31:58 +0000] 15366 MainThread agent        INFO     Re-using pre-existing directory: /run/cloudera-scm-agent/supervisor
[13/Jun/2014 12:31:58 +0000] 15366 MainThread agent        INFO     Re-using pre-existing directory: /run/cloudera-scm-agent/supervisor/include
[13/Jun/2014 12:31:58 +0000] 15366 MainThread agent        ERROR    Failed to connect to previous supervisor.
Traceback (most recent call last):
  File "/usr/lib/cmf/agent/src/cmf/agent.py", line 1236, in find_or_start_supervisor
    self.get_supervisor_process_info()
  File "/usr/lib/cmf/agent/src/cmf/agent.py", line 1423, in get_supervisor_process_info
    self.identifier = self.supervisor_client.supervisor.getIdentification()
  File "/usr/lib/python2.7/xmlrpclib.py", line 1224, in __call__
    return self.__send(self.__name, args)
  File "/usr/lib/python2.7/xmlrpclib.py", line 1578, in __request
    verbose=self.__verbose
  File "/usr/lib/cmf/agent/build/env/lib/python2.7/site-packages/supervisor-3.0-py2.7.egg/supervisor/xmlrpc.py", line 460, in request
    self.connection.request('POST', handler, request_body, self.headers)
  File "/usr/lib/python2.7/httplib.py", line 958, in request
    self._send_request(method, url, body, headers)
  File "/usr/lib/python2.7/httplib.py", line 992, in _send_request
    self.endheaders(body)
  File "/usr/lib/python2.7/httplib.py", line 954, in endheaders
    self._send_output(message_body)
  File "/usr/lib/python2.7/httplib.py", line 814, in _send_output
    self.send(msg)
  File "/usr/lib/python2.7/httplib.py", line 776, in send
    self.connect()
  File "/usr/lib/python2.7/httplib.py", line 757, in connect
    self.timeout, self.source_address)
  File "/usr/lib/python2.7/socket.py", line 571, in create_connection
    raise err
error: [Errno 111] Connection refused
[13/Jun/2014 12:31:58 +0000] 15366 MainThread tmpfs        INFO     Reusing mounted tmpfs at /run/cloudera-scm-agent/process
[13/Jun/2014 12:31:59 +0000] 15366 MainThread agent        INFO     Trying to connect to newly launched supervisor (Attempt 1)
[13/Jun/2014 12:31:59 +0000] 15366 MainThread agent        INFO     Successfully connected to supervisor
[13/Jun/2014 12:31:59 +0000] 15366 MainThread _cplogging   INFO     [13/Jun/2014:12:31:59] ENGINE Bus STARTING
[13/Jun/2014 12:31:59 +0000] 15366 MainThread _cplogging   INFO     [13/Jun/2014:12:31:59] ENGINE Started monitor thread '_TimeoutMonitor'.
[13/Jun/2014 12:31:59 +0000] 15366 MainThread _cplogging   INFO     [13/Jun/2014:12:31:59] ENGINE Serving on ubuntu:9000
[13/Jun/2014 12:31:59 +0000] 15366 MainThread _cplogging   INFO     [13/Jun/2014:12:31:59] ENGINE Bus STARTED
[13/Jun/2014 12:31:59 +0000] 15366 MainThread __init__     INFO     New monitor: (<cmf.monitor.host.HostMonitor object at 0x305b990>,)
[13/Jun/2014 12:31:59 +0000] 15366 MainThread agent        WARNING  Setting default socket timeout to 30!
[13/Jun/2014 12:31:59 +0000] 15366 MonitorDaemon-Scheduler __init__     INFO     Monitor ready to report: ('HostMonitor',)
[13/Jun/2014 12:31:59 +0000] 15366 MainThread agent        INFO     Using parcels directory from server provided value: /opt/cloudera/parcels
[13/Jun/2014 12:31:59 +0000] 15366 MainThread parcel       INFO     Agent does create users/groups and apply file permissions
[13/Jun/2014 12:31:59 +0000] 15366 MainThread downloader   INFO     Downloader path: /opt/cloudera/parcel-cache
[13/Jun/2014 12:31:59 +0000] 15366 MainThread parcel_cache INFO     Using /opt/cloudera/parcel-cache for parcel cache
[13/Jun/2014 12:31:59 +0000] 15366 MainThread agent        INFO     Active parcel list updated; recalculating component info.
[13/Jun/2014 12:32:04 +0000] 15366 Monitor-HostMonitor throttling_logger INFO     Using java location: '/usr/lib/jvm/java-7-oracle-cloudera/bin/java'.
[13/Jun/2014 12:32:04 +0000] 15366 Monitor-HostMonitor throttling_logger ERROR    Failed to collect NTP metrics
Traceback (most recent call last):
  File "/usr/lib/cmf/agent/src/cmf/monitor/host/ntp_monitor.py", line 39, in collect
    result, stdout, stderr = self._subprocess_with_timeout(args, self._timeout)
  File "/usr/lib/cmf/agent/src/cmf/monitor/host/ntp_monitor.py", line 32, in _subprocess_with_timeout
    return subprocess_with_timeout(args, timeout)
  File "/usr/lib/cmf/agent/src/cmf/monitor/host/subprocess_timeout.py", line 40, in subprocess_with_timeout
    close_fds=True)
  File "/usr/lib/python2.7/subprocess.py", line 679, in __init__
    errread, errwrite)
  File "/usr/lib/python2.7/subprocess.py", line 1249, in _execute_child
    raise child_exception
OSError: [Errno 2] No such file or directory
[13/Jun/2014 12:32:12 +0000] 15366 Monitor-HostMonitor throttling_logger ERROR    Timeout with args ['/usr/lib/jvm/java-7-oracle-cloudera/bin/java', '-classpath', '/usr/share/cmf/lib/agent-5.0.2.jar', 'com.cloudera.cmon.agent.DnsTest']
None
[13/Jun/2014 12:32:12 +0000] 15366 Monitor-HostMonitor throttling_logger ERROR    Failed to collect java-based DNS names
Traceback (most recent call last):
  File "/usr/lib/cmf/agent/src/cmf/monitor/host/dns_names.py", line 67, in collect
    result, stdout, stderr = self._subprocess_with_timeout(args, self._poll_timeout)
  File "/usr/lib/cmf/agent/src/cmf/monitor/host/dns_names.py", line 49, in _subprocess_with_timeout
    return subprocess_with_timeout(args, timeout)
  File "/usr/lib/cmf/agent/src/cmf/monitor/host/subprocess_timeout.py", line 81, in subprocess_with_timeout
    raise Exception("timeout with args %s" % args)
Exception: timeout with args ['/usr/lib/jvm/java-7-oracle-cloudera/bin/java', '-classpath', '/usr/share/cmf/lib/agent-5.0.2.jar', 'com.cloudera.cmon.agent.DnsTest']
Zong
  • 6,160
  • 5
  • 32
  • 46
user3613796
  • 161
  • 1
  • 1
  • 4

4 Answers4

8

I am facing similar problems. I've found how to solve the problem:

ERROR    Failed to collect NTP metrics

It's because NTP service is not installed/started. Try:

sudo apt-get update && sudo apt-get install ntp
sudo service ntp start
Mikhail Golubtsov
  • 6,285
  • 3
  • 29
  • 36
2

Got the same error, please assure that your hostname could be translated to your ip. Run ifconfig -a lookup your ip address for eth0, then run dig or host command using your FQDN and review the ip address is the same that ifconfig shows.

Follow this tutorial from cloudera: http://www.cloudera.com/content/cloudera-content/cloudera-docs/CDH4/latest/CDH4-Installation-Guide/cdh4ig_topic_11_1.html

2

When installing Cloudera 5.2 on AWS, this error occurs. It's a known issue and Cloudera put the workaround on their website (copied here):

Installing on AWS, you must use private EC2 hostnames. When installing on an AWS instance, and adding hosts using their public names, the installation will fail when the hosts fail to heartbeat.

Workaround: Use the Back button in the wizard to return to the original screen, where it prompts for a license.

Rerun the wizard, but choose "Use existing hosts" instead of searching for hosts. Now those hosts show up with their internal EC2 names.

Continue through the wizard and the installation should succeed.

user3303554
  • 2,215
  • 2
  • 16
  • 16
-8

Ensure that the host's hostname is configured properly. Ensure that port 7182 is accessible on the Cloudera Manager server (check firewall rules). Ensure that ports 9000 and 9001 are free on the host being added. Check agent logs in /var/log/cloudera-scm-agent/ on the host being added (some of the logs can be found in the installation details).