I have a 3-node replica set monitored by MMS. The two secondary nodes show up and report data in MMS, but the primary does not.

When I add the primary manually, it shows in the MMS dashboard as "No Data" for about 20 minutes, then disappears. The only related output in the agent log is shown below.

I've tried to increase the agent's logging level by changing logLevel to logging.DEBUG in logConfig.py, without success.
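For reference, a minimal sketch of one reason such a change can appear to do nothing in a stock Python `logging` setup: a handler with its own, higher threshold still filters records after the logger itself is set to DEBUG. This assumes the agent uses the standard `logging` module; the agent's actual logConfig.py is not reproduced here, and the logger name and handler below are hypothetical.

```python
import io
import logging

# Hypothetical logger name; the real agent's logger and logConfig.py layout are not shown here.
logger = logging.getLogger("mms_agent_example")
logger.propagate = False  # keep this example isolated from the root logger

# Capture output in memory so the effect of each change can be inspected.
stream = io.StringIO()
handler = logging.StreamHandler(stream)
handler.setLevel(logging.INFO)  # the handler has its own threshold
handler.setFormatter(logging.Formatter("%(asctime)s %(levelname)s %(message)s"))
logger.addHandler(handler)

# Raising only the logger level is not enough: the handler still drops DEBUG records.
logger.setLevel(logging.DEBUG)
logger.debug("first debug message")

# Lowering the handler threshold as well lets DEBUG records through.
handler.setLevel(logging.DEBUG)
logger.debug("second debug message")

output = stream.getvalue()
```

If the agent configures handlers this way, both levels would need to change; whether it does is an assumption that would have to be checked against the agent's own source.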

2013-08-29 10:46:20,508 INFO starting non-blocking stats monitoring: 10.X.X.X:27018
2013-08-29 10:46:20,508 INFO starting blocking stats monitoring: 10.X.X.X:27018
2013-08-29 10:46:20,509 INFO starting munin monitoring: 10.X.X.X:4949
2013-08-29 11:06:36,707 INFO stopping munin monitoring: 10.X.X.X:4949
2013-08-29 11:06:39,851 INFO stopping blocking stats monitoring: 10.X.X.X:27018
2013-08-29 11:06:39,896 INFO stopping non-blocking stats monitoring: 10.X.X.X:27018

Does anyone have ideas on how to troubleshoot why the node disappears, or how to increase the logging level to get more information from the agent?

  • I've verified connectivity from the MMS agent to the server in question for both the mongo port and the munin-node port. Services are running correctly on the server in question (it's the primary for the replicaset) – Rekibnikufesin Aug 29 '13 at 18:20
  • Currently have 16 other servers successfully reporting into MMS. Agent is running v1.5.9. – Rekibnikufesin Aug 29 '13 at 18:22
  • How did you verify connectivity? If you haven't I would try using the mongo shell from your agent host, making sure to use the same host name used in your replica set configuration (rs.conf()) – James Wahlin Aug 29 '13 at 19:44
  • I verified connectivity to mongo by issuing `mongo --host 10.X.X.X --port 27018` and successfully connecting from the MMS agent. It's listed as 10.X.X.X:27018 in the rs.conf(). I verified the connection to munin-node by issuing `telnet 10.X.X.X 4949` from the monitoring agent and successfully issuing munin fetch commands – Rekibnikufesin Aug 29 '13 at 22:18
  • If this is still an issue for you please post the name of your MMS group - I work for MongoDB and can take a closer look. – James Wahlin Sep 17 '13 at 16:53
  • Thanks James. The MMS group is myList. – Rekibnikufesin Sep 17 '13 at 21:36
  • It is still an issue. As additional info- I added a new replicaset to monitor, all 3 were there and reporting data and after a few hours the primary disappeared. The primary was the host I originally added to get the new replicaset added to MMS – Rekibnikufesin Sep 17 '13 at 21:38
  • This has been fixed for one of your replica sets. Please confirm whether it was the one in question. – James Wahlin Sep 18 '13 at 20:51
  • Yup- I see the last host I added is now listed, but it's not collecting data: it's highlighted in red and last ping status is 5 days ago. – Rekibnikufesin Sep 19 '13 at 17:42

1 Answer

I opened a ticket with MongoDB, Inc. and was able to get all of my servers reporting into MMS. I'm not sure which steps actually triggered the fix, so I'll post all of the details here. After James added one of the servers back in (see the comments on the question), it was listed in MMS but not reporting data. I restarted the MMS agent, and the server began reporting into MMS.

I also had two other servers, each a member of a different replica set, that didn't show up in MMS. MongoDB, Inc. suggested I add them manually to MMS using the "+ Add Host" option. Sure enough, they were identified and began reporting data after about 10 minutes.

In summary, it looks like a combination of adding hosts manually and restarting my local MMS agent was needed to resolve this issue.
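For anyone repeating the connectivity checks from the comments (the `mongo` shell and `telnet` tests), here is a minimal sketch that tests TCP reachability of the mongod and munin-node ports from the agent host. The function names are hypothetical, and a successful connect only shows that the port is reachable, not that the agent can authenticate or collect stats.

```python
import socket


def can_connect(host, port, timeout=5.0):
    """Return True if a TCP connection to host:port succeeds within timeout."""
    try:
        with socket.create_connection((host, port), timeout=timeout):
            return True
    except OSError:
        # Covers refused connections, timeouts, and name resolution failures.
        return False


def check_endpoints(host, ports, timeout=5.0):
    """Map each port to whether a TCP connection to it succeeded."""
    return {port: can_connect(host, port, timeout) for port in ports}
```

Run from the host where the MMS agent lives, something like `check_endpoints(host, [27018, 4949])` would cover the two ports from the log above, with `host` set to the exact address listed in rs.conf().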