
I'm trying to build monitoring and observability infrastructure for my Hadoop cluster.

My cluster is managed via Cloudera Manager, so I have some questions that maybe some of you could help me with:

  1. Where does Cloudera Manager store its metrics? In a TSDB?
  2. How are the metrics collected? Via exporters?
  3. Is there a way to use Cloudera Manager as a data source in Grafana?

And the main question is: what is the right approach for the infrastructure architecture? Should I run a JMX exporter for each service JVM, store all of the metrics in a TSDB like Prometheus, and query it in Grafana?

If any more information is needed, I'd be more than happy to provide it.

  • In recent releases, the base Cloudera license includes its full-scale [observability platform](https://www.cloudera.com/products/observability.html). What's wrong with using it OOTB? Why do you need another solution? – mazaneicha Aug 07 '23 at 21:20
  • Sounds cool, but my cluster is on-prem and I'm not using a cloud service. Does the base version include this? – Liran Eliyahu Aug 08 '23 at 06:28
  • You can publish your telemetry to Cloudera from on-prem clusters too (https://docs.cloudera.com/observability/cloud/configuration/topics/obs-private-architecture.html). It's also definitely worth checking with your vendor rep whether they are planning a version for on-prem deployment any time soon. – mazaneicha Aug 08 '23 at 16:25

1 Answer


AFAIK, yes, Cloudera Manager has its own metrics database (I'm not sure whether it is an open source tool, but the monitoring information is stored in a database). I think there are Python agents that collect the metrics.

Yes, there is a Grafana data source for Cloudera Manager.
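If you want to pull those metrics yourself rather than go through a plugin, Cloudera Manager also exposes a REST time-series endpoint that takes a tsquery string. Below is a rough sketch of building such a request and flattening the JSON that comes back. The host, API version, and tsquery statement are placeholders I made up, and the response layout (`items` → `timeSeries` → `metadata`/`data`) is my understanding of the API, so check it against the API docs for your CM version.

```python
from urllib.parse import urlencode


def build_timeseries_url(base_url, api_version, tsquery, start_iso, end_iso):
    """Build a Cloudera Manager time-series request URL.

    base_url and api_version are assumptions; use your own CM host and
    whatever API version your CM release supports.
    """
    params = urlencode({"query": tsquery, "from": start_iso, "to": end_iso})
    return f"{base_url}/api/{api_version}/timeseries?{params}"


def flatten_points(response):
    """Turn the decoded JSON response into (entity, metric, timestamp, value) rows.

    Assumes the layout items -> timeSeries -> metadata/data; missing keys
    are tolerated and come back as None or are skipped.
    """
    rows = []
    for item in response.get("items", []):
        for series in item.get("timeSeries", []):
            meta = series.get("metadata", {})
            for point in series.get("data", []):
                rows.append(
                    (
                        meta.get("entityName"),
                        meta.get("metricName"),
                        point.get("timestamp"),
                        point.get("value"),
                    )
                )
    return rows


if __name__ == "__main__":
    # Hypothetical host, version, and query, for illustration only.
    print(
        build_timeseries_url(
            "http://cm.example.com:7180",
            "v19",
            "select cpu_percent",
            "2023-08-08T00:00:00",
            "2023-08-08T01:00:00",
        )
    )
```

You would fetch the URL with HTTP basic auth using any client you like, decode the body as JSON, and pass it to `flatten_points`.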

JMX exporters would show a lot more detail than what you see there, but you'll then need to create your own dashboards for that data.
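To get a feel for what a JMX exporter gives you before wiring up Prometheus, you can fetch its `/metrics` page and look at the samples directly. Here is a minimal sketch of parsing that Prometheus text format. The metric names in the sample are only illustrative, and the parser deliberately skips comments and ignores optional timestamps, so treat it as an inspection aid rather than a full implementation of the format.

```python
def parse_exposition(text):
    """Parse Prometheus text-format samples into (series, value) pairs.

    The series string keeps its label set, e.g. 'metric{label="x"}'.
    HELP/TYPE comment lines and blank lines are skipped; a trailing
    timestamp, if present, is ignored.
    """
    samples = []
    for raw in text.splitlines():
        line = raw.strip()
        if not line or line.startswith("#"):
            continue
        if "{" in line:
            # Split after the closing brace so spaces inside label values are safe.
            end = line.rindex("}")
            series = line[: end + 1]
            rest = line[end + 1 :].split()
        else:
            parts = line.split()
            series, rest = parts[0], parts[1:]
        if not rest:
            continue  # malformed line with no value
        samples.append((series, float(rest[0])))
    return samples


if __name__ == "__main__":
    # Illustrative sample, not output captured from a real cluster.
    page = """
# HELP jvm_memory_bytes_used Used bytes of a given JVM memory area.
# TYPE jvm_memory_bytes_used gauge
jvm_memory_bytes_used{area="heap"} 1048576
jvm_threads_current 42
"""
    for series, value in parse_exposition(page):
        print(series, value)
```

In a real setup Prometheus scrapes each exporter endpoint on its own, and Grafana queries Prometheus, so a script like this is only useful for checking which metrics a given service JVM actually exposes.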

OneCricketeer