
I would like to use Graphite to collect metrics from different servers. By default, carbon listens on port 2003 on all interfaces, which is fine with me.

Now anyone could theoretically send metric data there. Is there a standard way to prevent this (similar to HTTP basic auth), or do I need to mess with IP-based restrictions on the physical interface?

ProfHase85

1 Answer


It depends on how much you want to "harden" your Graphite nodes ("Graphite" being whatever mix of carbon-relays, carbon-caches, storage backend, and potentially graphite-web/api makes up your topology).

If you know which hosts in your network should be sending metrics to Graphite (typically relays), you can modify the firewall rules on your Graphite host to accept traffic on the applicable ports only from an explicit list of host IPs or an IP range. Or you could do something similar at the network edge with a firewall or router - I have no advice there, as your question doesn't give a fuller picture of what your topology looks like.
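To make the allowlist idea concrete, here is a minimal Python sketch of the decision a host firewall rule would be making for each connection to port 2003. It is not a firewall and not part of Graphite; the addresses, the `ALLOWED_SOURCES` list, and the `is_allowed_sender` helper are all hypothetical placeholders.

```python
import ipaddress

# Hypothetical allowlist: one explicit relay host plus one subnet.
# These addresses are placeholders, not taken from the question.
ALLOWED_SOURCES = [
    ipaddress.ip_network("10.0.1.15/32"),
    ipaddress.ip_network("10.0.2.0/24"),
]


def is_allowed_sender(source_ip: str) -> bool:
    """Return True if source_ip falls inside any allowlisted network."""
    addr = ipaddress.ip_address(source_ip)
    return any(addr in network for network in ALLOWED_SOURCES)
```

In practice you would express the same list as rules in whatever firewall your hosts already use, rather than checking it in application code.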

An alternative approach would be to use carbon's AMQP support: have your nodes publish their metrics to a broker as an authenticated user, then modify the host firewall on your Graphite host(s) to only accept TCP 2003 from the broker the metrics arrive through. The upside here is that your Graphite node(s) only need to know which broker metrics will be coming from, which drastically simplifies any host firewall rules. Having nodes publish metrics through a lightweight service also secures things a bit better, in that the "trust" concern you have is taken care of at the top of the flow rather than when metrics - legit or not - arrive at your Graphite host(s). RabbitMQ is OSS and pretty simple to get up and going without needing to mess around too much with configuration if you pull in the Management Plugin. Most of its config is opening the ports it needs to operate.
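As a rough sketch of the publishing side, a node could send a metric to the broker with the third-party `pika` client along these lines. The broker hostname, user, and password are placeholders. The exchange name `graphite` and the message layout (metric name as the routing key, body of `value timestamp`) are my assumptions about carbon's AMQP settings with `AMQP_METRIC_NAME_IN_BODY` turned off, so check them against your own carbon.conf. The exchange is assumed to exist already.

```python
import time


def format_metric(value, timestamp=None):
    """Build the message body: "<value> <unix timestamp>"."""
    ts = int(time.time()) if timestamp is None else int(timestamp)
    return f"{value} {ts}"


def publish_metric(name, value,
                   host="broker.example.internal",  # placeholder host
                   user="metrics",                  # placeholder user
                   password="change-me",            # placeholder secret
                   exchange="graphite"):            # assumed exchange name
    """Publish one metric to the broker as an authenticated user."""
    import pika  # third-party; imported here so format_metric works without it

    credentials = pika.PlainCredentials(user, password)
    params = pika.ConnectionParameters(host=host, credentials=credentials)
    connection = pika.BlockingConnection(params)
    try:
        channel = connection.channel()
        # The metric name travels as the routing key (assumption, see above).
        channel.basic_publish(exchange=exchange,
                              routing_key=name,
                              body=format_metric(value))
    finally:
        connection.close()
```

A call such as `publish_metric("server.cpu.load", 0.42)` would then only succeed for senders holding valid broker credentials, which is where the authentication the question asks about comes from.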

However, this makes a simple publisher-to-Graphite topology more complex and firmly establishes a pub/sub model for how your metrics get to Graphite (though it comes with the nice side benefit that metrics in transit are less likely to be lost). It also adds yet another host to secure within operational reason.

To go way further, you could implement a log monitoring system to watch carbon-relay's listener.log file, as it will write a line for every metric received and processed. At a high level, you'd watch that log for metrics that don't match what you expect. For example, if you have a server.cpu.load metric, you'd expect to see it getting processed, whereas a posted metric called foo.bar.value would not be valid. As a response to such an event, you could simply wipe the corresponding directory Whisper creates for the invalid namespace (if you use Whisper for storage).
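The checking step could look something like the following sketch. It deliberately works on metric names only, since how you extract them from listener.log depends on your log format, and the patterns in `EXPECTED_PATTERNS` are hypothetical examples built around the server.cpu.load case above.

```python
import re

# Hypothetical patterns describing the metric namespaces you expect to see.
EXPECTED_PATTERNS = [
    re.compile(r"^server\.cpu\.load$"),
    re.compile(r"^server\.[a-z0-9_]+\.[a-z0-9_.]+$"),
]


def unexpected_metrics(names):
    """Return the metric names that match none of the expected patterns."""
    return [
        name for name in names
        if not any(pattern.match(name) for pattern in EXPECTED_PATTERNS)
    ]
```

Anything this returns is a candidate for alerting or for cleaning up the matching Whisper directory.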

Hardening Graphite's carbon-relay and carbon-cache is fine and a smart thing to do, but do not forget that who can access your Graphite webapp or graphite-api to get those metrics out is just as much of a concern.

  • Thank you for this extended explanation, I did not know about the AMQP support and it seems like the best way for me, very helpful – ProfHase85 Apr 11 '18 at 15:32
  • Certainly - glad it was useful. The AMQP listener functionality of Graphite is somewhat obscured to the point even Graphite devs have pondered if anyone uses it at all (couldn't dig that comment up). However, I prefer it since it integrates better to my company's own topology and allows us to keep tabs on the flow of traffic from metric producers to the Graphite worker endpoints. Additionally, we stand up different message queues & exchanges for different types of metrics so that they ultimately end up in a different Graphite cluster. The intent is to scope who can see what from a frontend. – Ruhrohshingo Apr 12 '18 at 15:48
  • Regarding the "log monitoring" bit: graphite has a blacklist and whitelist config that are a more elegant and performant solution to the problem of unwanted data in the naming hierarchy. – 7yl4r Dec 20 '18 at 18:20