Questions tagged [ganglia]

Ganglia is a scalable distributed monitoring system for high-performance computing systems such as clusters and grids. It allows the user to remotely view live or historical statistics (such as CPU load averages or network utilization) for all machines that are being monitored.

Ganglia is based on a hierarchical design targeted at federations of clusters. It relies on a multicast-based listen/announce protocol to monitor state within clusters and uses a tree of point-to-point connections amongst representative cluster nodes to federate clusters and aggregate their state. It leverages widely used technologies such as XML for data representation, XDR for compact, portable data transport, and RRDtool for data storage and visualization.
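
As a rough illustration of the XML data representation mentioned above, the following minimal sketch parses a gmond-style XML dump into a per-host dictionary of metrics. The embedded sample document, the host and cluster names, and the `parse_metrics` helper are all hypothetical and exist only for illustration; real gmond output carries many more attributes and should be checked against your own installation.

```python
import xml.etree.ElementTree as ET

# Hypothetical, heavily trimmed sample of the kind of XML a gmond daemon emits.
# Host names, IPs, and values are made up for illustration.
SAMPLE_XML = """<GANGLIA_XML VERSION="3.1.7" SOURCE="gmond">
  <CLUSTER NAME="demo-cluster">
    <HOST NAME="node1" IP="10.0.0.1">
      <METRIC NAME="load_one" VAL="0.42" TYPE="float" UNITS=""/>
      <METRIC NAME="cpu_user" VAL="12.5" TYPE="float" UNITS="%"/>
    </HOST>
    <HOST NAME="node2" IP="10.0.0.2">
      <METRIC NAME="load_one" VAL="1.10" TYPE="float" UNITS=""/>
    </HOST>
  </CLUSTER>
</GANGLIA_XML>"""


def parse_metrics(xml_text):
    """Return {host_name: {metric_name: raw_value_string}} from a gmond-style dump."""
    root = ET.fromstring(xml_text)
    result = {}
    for host in root.iter("HOST"):
        # Values are kept as strings; the TYPE attribute says how to interpret them.
        result[host.get("NAME")] = {
            metric.get("NAME"): metric.get("VAL") for metric in host.iter("METRIC")
        }
    return result


metrics = parse_metrics(SAMPLE_XML)
for host_name, host_metrics in sorted(metrics.items()):
    print(host_name, host_metrics)
```

In practice the XML text would come from a running daemon rather than a string literal; one of the questions below reads it with a simple 'telnet localhost 8649'.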

153 questions
1
vote
1 answer

Publishing metrics to ganglia using gmetric4j

I'm considering using gmetric4j to publish metrics to ganglia. So far the only documented way I found for doing this is to use its GSampler class to make a metric data polling Runnable that runs at scheduled times. In my application, though, it…
miljanm
  • 906
  • 7
  • 20
1
vote
2 answers

How to see jvm metrics report in ganglia web

I need to see JVM metrics in the ganglia report. I've set up jmxetric on a node and can see JVM metrics when using a simple 'telnet localhost 8649'. I have ganglia-web with gmond/gmetad running on another machine and it shows standard reports such as…
Igor Semenko
  • 459
  • 1
  • 7
  • 18
1
vote
1 answer

Ganglia - security when polling metrics over TCP (xml format) from nodes

Context: I am a student and I am trying to prepare a proof of concept for quick network monitoring. Our imaginary context is that we have multiple clusters which are on different subnets. I have read a lot of documentation regarding ganglia and…
laycat
  • 5,381
  • 7
  • 31
  • 46
1
vote
2 answers

Ganglia - RRD(round robin database) scalability

I've just come across RRD lately while trying out the ganglia monitoring system. Ganglia stores the monitoring data in RRD. I am wondering how RRD works from a scalability perspective. What if I have a potentially huge amount of data to store?…
Shengjie
  • 12,336
  • 29
  • 98
  • 139
1
vote
1 answer

How to use cloudera management (ui) console to edit hadoop-metrics.properties?

I am trying to monitor Hbase using Ganglia. How to use cloudera management console to edit dfs.server property in the hadoop-metrics.properties? According to http://wiki.apache.org/hadoop/GangliaMetrics I need to…
user244333
1
vote
2 answers

Ganglia and Amazon Elastic Map Reduce - install issues

Following the instructions for "Initializing Ganglia on a Job Flow" I get my cluster up but don't see any Ganglia process running (on 8157). …
Tom Emmons
  • 103
  • 1
  • 7
0
votes
0 answers

How to create a virtual environment with singularity containers (influxdb, ganglia, prometheus) for generating (ML) train data

I am creating a virtual environment initially using docker containers locally on my Ubuntu machine. Eventually, I will take it to the HPC environment and run it there in much larger scales. The scenario is as follows: I am running a small program…
0
votes
2 answers

Pyspark stuck and not processing. It shows more than 1K processes

I have a for loop running in databricks and the first iterations run fast, then it gets slower and then it doesn't proceed at all. While I know that is common in for loops if data size is increasing on each iteration and/or there's garbage…
0
votes
0 answers

Spark ganglia report not matching Databricks cluster specifications

I have a databricks cluster on AWS, with a minimum of two nodes and a maximum of 8. Here's a picture of my cluster. I have cached a dataframe, and under SparkUI on the storage tab I see it's 6.7 GB. So I would expect that if I go to ganglia's UI, I would see that…
0
votes
1 answer

How to get memory usage and Cpu utilization from cluster

We are using AWS EMR to run spark jobs. From ganglia we see that the memory utilisation of our cluster is low as compared to the allocated memory. This is the case with cpu utilisation as well. We are currently reporting spark metrics by…
0
votes
1 answer

Getting error while pushing Solr Metrics to Ganglia

I am new to Solr and trying to push solr metrics to Ganglia. I have modified solr.xml as below: ${host:} ${jetty.port:8983}
0
votes
1 answer

Crashing: gmetad of Ganglia crashing because of Buffer Overflow

I use Ganglia to monitor Hadoop Flume Agents' performance. For almost 1 year now, it had been working very well. Last week gmetad started crashing with a buffer overflow. The only thing that has changed in the last few days is we started monitoring more…
Viren
  • 170
  • 1
  • 10
0
votes
1 answer

What does 'sintr' mean (in Ganglia)?

I have created the following view in Ganglia, showing cpu_user stats: Can someone tell me what Sintr means? I was not able to find any information on Google or stackexchange websites. Interestingly, I have two servers with identical hardware that…
andreee
  • 4,459
  • 22
  • 42
0
votes
0 answers

How do I get Nagios plugins to work?

I have installed Nagios successfully, and added the following services: nano /usr/local/nagios/etc/objects/services.cfg define host { use linux-server host_name dcctst1e address …
Kristada673
  • 3,512
  • 6
  • 39
  • 93
0
votes
1 answer

Import error with python gmond_python_modules in Ganglia

I used gmond_python_modules, trying to monitor one cluster having several hosts, each with 8 GPUs. After the last steps, I tried to restart the gmond service on my web node, only to get: Starting GANGLIA gmond: Could not find platform independent…
lincr
  • 1,633
  • 1
  • 15
  • 36