3

I am planning to use AWS Auto Scaling groups for my web servers. At the moment I am using munin as my monitoring solution. In the configuration file on the munin master server, you have to list an IP address or host name for every host you want to monitor.

With autoscaling, the number of instances will change frequently, and static entries in the munin config do not fit well into this environment. I could query the addresses of all servers I want to monitor and then regenerate the munin master configuration file, but that does not seem like a good approach to me.

What is the preferred way of using munin in such an environment? Does anyone use munin with autoscaling?

In general I would like to keep using munin rather than switch to another monitoring solution, because I have written quite a lot of specific plugins that I rely on. However, if you know another monitoring solution that would probably let me keep my plugins, I am open to that as well.

j0nes
  • Good problem. I am following this post. Also I am looking for solution. – Jeevan Dongre Oct 23 '13 at 05:51
  • Did you solve the issue? Can you share your conclusions? – myroslav Sep 16 '15 at 22:33
  • No, not solved yet. We did not implement autoscaling, but we still use munin for our static servers. However, for alerting and corresponding actions we found that CloudWatch was much better suited. – j0nes Sep 17 '15 at 06:57

2 Answers


A year ago we used munin as an alternative monitoring system, and I will tell you one thing: I don't like it at all. We also had some automation for autoscaling in Nagios, but that is an ugly way to monitor a large number of AWS instances too, because Nagios starts to lag or crash once it is monitoring a certain number of instances.

If you have more than 150-200 instances to monitor, I suggest using a commercial service like StackDriver or one of its alternatives.

Peycho Dimitrov

I stumbled across this old topic because I was looking for a solution to the same problem. I finally found a way that works for me, which I would like to share with you. The tl;dr summary:

  • use the AWS Python API (boto3) to get all instances in the same VPC as the munin master
  • test whether the munin port 4949 is open on the instances found, to detect munin nodes
  • create munin.conf from a munin.base.conf (without nodes) and append entries for all the nodes found
  • run the script on the munin master every 5 minutes via cron (an example crontab entry is shown below)
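
The cron step might look like the following crontab entry (the script location and config paths are assumptions; adjust them to your setup):

*/5 * * * * /usr/local/bin/muninconf.py /etc/munin/munin.base.conf /etc/munin/munin.conf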

Finally, here is my Python script which does all the magic:

#! /usr/bin/python

import boto3
import requests
import argparse
import shutil
import socket

socketTimeout = 2  # seconds to wait when probing the munin port

ec2 = boto3.client('ec2')


def getVpcId():

        # Ask the EC2 instance metadata service for this instance's ID
        response = requests.get('http://169.254.169.254/latest/meta-data/instance-id')
        instance_id = response.text

        # Resolve the instance ID to the VPC it lives in
        response = ec2.describe_instances(
                Filters=[
                        {
                                'Name' : 'instance-id',
                                'Values' : [ instance_id ]
                        }
                ]
        )

        return response['Reservations'][0]['Instances'][0]['VpcId']


def findNodes(tag):

        # Collect all instances in this VPC that carry the given tag key
        result = []

        vpcId = getVpcId()

        response = ec2.describe_instances(
                Filters=[
                        {
                                'Name' : 'tag-key',
                                'Values' : [ tag ]
                        },
                        {
                                'Name' : 'vpc-id',
                                'Values' : [ vpcId ]
                        }
                ]
        )

        for reservation in response['Reservations']:
                for instance in reservation['Instances']:
                        result.append(instance)

        return result


def getInstanceTag(instance, tagName):

        # Return the value of the named tag, or None if the instance does not have it
        for tag in instance['Tags']:
                if tag['Key'] == tagName:
                        return tag['Value']

        return None


def isMuninNode(host):

        # Treat a host as a munin node if it accepts TCP connections on port 4949
        s = socket.socket(socket.AF_INET, socket.SOCK_STREAM)
        s.settimeout(socketTimeout)

        try:
                s.connect((host, 4949))
                s.shutdown(socket.SHUT_RDWR)
                return True
        except Exception:
                return False
        finally:
                s.close()

def appendNodesToConfig(nodes, target, tag):

        with open(target, "a") as file:
                for node in nodes:
                        hostname = getInstanceTag(node, tag)

                        # Skip instances that do not carry the tag at all
                        if hostname is None:
                                continue

                        # Strip a trailing dot from fully qualified DNS names
                        if hostname.endswith('.'):
                                hostname = hostname[:-1]

                        # Only add hosts that actually answer on the munin port
                        if isMuninNode(hostname):
                                file.write('[' + hostname + ']\n')
                                file.write('\taddress ' + hostname + '\n')
                                file.write('\tuse_node_name yes\n\n')


parser = argparse.ArgumentParser("muninconf.py")
parser.add_argument("baseconfig", help="base munin config to append nodes to")
parser.add_argument("target", help="target munin config")
args = parser.parse_args()
base = args.baseconfig
target = args.target


# Start from the base config (without any nodes); the discovered nodes are appended below
shutil.copyfile(base, target)

nodes = findNodes('CNAME')
appendNodesToConfig(nodes, target, 'CNAME')
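
For reference, each node that passes the check ends up in the generated munin.conf as an entry of this shape (the host name here is just a made-up example):

[web-1.internal.example.com]
	address web-1.internal.example.com
	use_node_name yes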

For the API calls to work, you have to set up AWS API credentials or assign an IAM role with the required permissions (ec2:DescribeInstances as a bare minimum) to your munin master instance, which is my preferred method.
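
A minimal IAM policy for such a role could look like this (a sketch covering only the describe call; your setup may need more):

{
    "Version": "2012-10-17",
    "Statement": [
        {
            "Effect": "Allow",
            "Action": "ec2:DescribeInstances",
            "Resource": "*"
        }
    ]
}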

Some final implementation notes:

I have a tag named CNAME assigned to all my AWS instances that holds the internal DNS host name. Therefore I filter for this tag and use its value as the node name and address for the munin configuration. You will probably have to change this for your setup.
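
If your instances do not carry such a tag yet, it can be set via boto3 as well; in this sketch the instance ID and host name are made up:

ec2.create_tags(
        Resources=[ 'i-0123456789abcdef0' ],
        Tags=[ { 'Key' : 'CNAME', 'Value' : 'web-1.internal.example.com' } ]
)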

Another option would be to assign a specific tag to all the instances you want to monitor with munin. You could then filter for that tag and probably also skip the check for the open munin port; a sketch of that filter follows below.
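
The corresponding describe_instances filter could look like this (the Monitoring tag key and its munin value are hypothetical):

response = ec2.describe_instances(
        Filters=[
                {
                        'Name' : 'tag:Monitoring',
                        'Values' : [ 'munin' ]
                },
                {
                        'Name' : 'vpc-id',
                        'Values' : [ vpcId ]
                }
        ]
)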

Hope this is of some help.

Cheers, Oliver

oros