
We have DataStax Enterprise installed on a Rackspace Cloud instance.

We configured a test cluster with a first node without any problems. Then we created a new instance in the Rackspace Cloud and installed the same (latest) DataStax version on it.

But when we try to add a second node to the cluster, it returns the following error:

(1) Error: Installation stage failed: The following packages are already installed: dse-full, dse-pig, dse-libpig, dse-libsolr, dse-libtomcat, dse-libsqoop, dse-liblog4j, dse-libmahout, dse-demos, dse-hive, dse-libhive, dse, dse-libhadoop-native, dse-libhadoop, dse-libcassandra

To try to solve the problem, we deleted the packages and tried to add the new node to the cluster again.

The installation script runs, but we get this error:

(2) Installed Errored: The installed agent doesn't seem to be responding.

If we review the Activity Console of the new node (server), the OpsCenter agent appears to be running, but we still get error (2) above.

Sven Delmas
  • So how are you installing the software? Are you calling yum/apt-get directly, are you using OpsCenter, and which installation script are you referring to? Is the second instance a fresh OS image, or is it based on the first machine? – Sven Delmas Dec 04 '13 at 16:28
  • The second instance is a fresh OS image with Ubuntu Server 12.04.3 installed. For the first error, we used apt-get directly on the new instance and chose the dse-full and opscenter packages. Then we attempted to add the new node from the OpsCenter installed on the first node. The installation script is the one OpsCenter runs after we start the "add a new node" process from the first server. Thanks in advance. – CorporateO Dec 04 '13 at 18:21

1 Answer


OpsCenter determines whether an agent has come up based on its ability to ping the /alive route of the agent's HTTP API, which runs by default on port 61621. If you change the log level in opscenterd.conf, you will see the HTTP request being made (the log entry starts with "Performing HTTP request"). You'll want to ensure that the IP and port used in that request are reachable from the opscenterd machine.

If they are, you'll want to verify the agent is running properly, and check the agent.log for any errors.

As a last resort, you can try uninstalling the agent as well before attempting to add the node again.
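
The reachability check described above can be approximated by hand from the opscenterd machine. The following is only a minimal sketch: it assumes the agent serves /alive over plain HTTP and answers with a 2xx status when healthy, and the helper name `agent_alive` and the example address are hypothetical, not part of OpsCenter.

```python
import http.client
import urllib.error
import urllib.request


def agent_alive(host, port=61621, timeout=5.0):
    """Return True if GET http://host:port/alive answers with a 2xx status.

    Assumption: the agent speaks plain HTTP on this port and treats a
    2xx response as "alive". Adjust if your setup differs.
    """
    url = f"http://{host}:{port}/alive"
    try:
        with urllib.request.urlopen(url, timeout=timeout) as resp:
            return 200 <= resp.status < 300
    except (urllib.error.URLError, http.client.HTTPException, OSError):
        # Connection refused, timeout, firewall drop, non-2xx status, etc.
        return False


if __name__ == "__main__":
    # Placeholder address: replace with the IP shown in the
    # "Performing HTTP request" log line.
    print(agent_alive("192.0.2.10"))
```

A result of False here only says the route was not reachable from this machine; it does not distinguish a firewall problem from an agent that is not running, so agent.log is still worth checking.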

mbulman
  • We verified the ports and they are correctly configured, but the following error appears in agent.log: ------ ERROR [StompConnection receiver] 2013-12-05 19:36:46,803 Error connecting via JMX: java.io.IOException: Failed to retrieve RMIServer stub: javax.naming.ServiceUnavailableExcept$ java.net.ConnectException: Connection refused] ------- – CorporateO Dec 05 '13 at 20:05
  • By default JMX binds to all interfaces, and the agent attempts to connect to 127.0.0.1 on port 7199. OpsCenter doesn't change any of this during provisioning, so if the agent is unable to connect over JMX, my guess is the DSE/Cassandra process is not up. Can you verify that you can telnet to 7199 locally? – mbulman Dec 05 '13 at 21:20
  • We are getting other errors. Let's recap: initially we had a server with Apache Cassandra installed and a cluster configured. When we ran an internal process (massive record creation), the server ran out of memory, which brought our service down. So we decided to switch to DSE. We installed DSE, ran the internal process again, and it worked fine until the compaction process used all the disk space on the server, which is why we decided to add a new node to the cluster. We used this documentation: http://www.datastax.com/docs/datastax_enterprise3.2/deploy/multi_dc_install – CorporateO Dec 06 '13 at 19:40
  • But the installation was not successful, so we tried to add the node through OpsCenter and started getting the errors mentioned at the beginning of this question. We thought that if we installed OpsCenter and DSE on the new node, the agent would work. But we started getting errors. – CorporateO Dec 06 '13 at 19:41
  • Is there a reference or list of requirements for creating a cluster properly and then adding a new node on a different server using OpsCenter? We ask because we are now getting this error when trying to do it: Start stage failed: Failed to start node 67.23.43.22:/usr/sbin/service dse start failed – CorporateO Dec 06 '13 at 19:42
  • If it's getting to the point where it's trying to start dse, at least we've gotten farther than your previous problem. Check /var/log/cassandra/system.log and /var/log/cassandra/output.log for any startup errors – mbulman Dec 12 '13 at 17:59
  • Thanks, we have solved the problem. When you tell the OpsCenter agent to install a new node, it copies the DataStax agent files to the new instance and begins the DSE server installation process. When the agent attempts to start the DSE service, there is an error with the memory of the Java virtual machine. We fixed this by reinstalling and overwriting the DSE files through apt-get and starting the service manually. We suppose the DataStax agent is not configuring the JRE properly. Thanks for your support. – CorporateO Dec 19 '13 at 20:24
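
For anyone without telnet on the node, the local JMX port check suggested in the comments above can be sketched in a few lines of Python. This is only an illustration: the helper name `port_open` is hypothetical, and it tests nothing more than whether a TCP connection to 127.0.0.1:7199 is accepted, not whether JMX itself is healthy.

```python
import socket


def port_open(host="127.0.0.1", port=7199, timeout=3.0):
    """Return True if a TCP connection to host:port is accepted."""
    try:
        with socket.create_connection((host, port), timeout=timeout):
            return True
    except OSError:
        # Connection refused or timed out: nothing is listening on the
        # port, or it is not reachable from here.
        return False


if __name__ == "__main__":
    if port_open():
        print("Port 7199 accepts connections")
    else:
        print("Port 7199 refused the connection; is the DSE/Cassandra process up?")
```

A refused connection on the node itself is consistent with the DSE/Cassandra process not running, in which case /var/log/cassandra/system.log and /var/log/cassandra/output.log are the next places to look.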