I have a requirement to collect custom metrics (via a shell script) from a logical server cluster every five minutes. This server cluster consists of 10 virtual machines. One option I am familiar with is creating cron jobs on every one of these VMs and reporting the result to an endpoint. I feel this option is a little cumbersome to maintain if the custom metrics script keeps changing. Are there any other options available to make this process more user friendly and maintainable?
3 Answers
People are naturally going to suggest things like Ansible, and that's a good idea if you're doing multiple things like this on a regular basis, but suppose you don't need all that right now.
Running a script on another machine is actually pretty easy:
rsync script.sh user@${servername}:/path/of/script/
ssh user@${servername} /path/of/script/script.sh
So is running it on all the machines:
for servername in server1 server2 server3; do
  rsync script.sh user@${servername}:/path/of/script/
  ssh user@${servername} /path/of/script/script.sh
done
You can put a script which does that in cron on a single machine, and now you're maintaining one crontab and one copy of the script, which is automatically replicated.
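For concreteness, a minimal sketch of the crontab entry on that single machine; the wrapper script name and paths (/usr/local/bin/collect-metrics.sh, the log file) are hypothetical, not part of the answer above:

# Assumes the loop above is saved as /usr/local/bin/collect-metrics.sh and that
# key-based ssh from this machine to each VM works non-interactively.
*/5 * * * * /usr/local/bin/collect-metrics.sh >> /var/log/collect-metrics.log 2>&1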

Managing deployment of similar things and similar configurations on multiple machines sounds like a great reason to learn some Ansible.
The base requirements on the endpoints are ssh access and a Python installation (nowadays you really should make sure they can run Python 3; Python 2 reaches EOL on Jan 1st, 2020).

- Thanks for the response. Ansible is a great solution for running it once. However, in my case I am looking to execute the metrics collection script every five minutes like a cron job (_collect data from server cluster every five minutes_). – kn9 Aug 12 '19 at 19:33
- Of course - I was thinking more in terms of using Ansible to simplify deployment of the scripts and of the cron configuration. This way you can change the scripts in one place when required, and use features like templates and host groups for changes required on only some of your machines, if applicable. – Mikael H Aug 12 '19 at 19:50
- I see, OK - thanks for the clarification. I will give it a try. – kn9 Aug 12 '19 at 20:01
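Following up on that exchange, a minimal sketch of using ad-hoc Ansible commands to deploy both the script and the cron schedule; the inventory group name (metrics), the script name, and the install path are assumptions for illustration:

# Copy the current version of the script to every VM in the hypothetical "metrics" group.
ansible metrics -m copy -a "src=collect_metrics.sh dest=/usr/local/bin/collect_metrics.sh mode=0755"
# Ensure a cron entry on every VM that runs the script every five minutes.
ansible metrics -m cron -a "name='collect custom metrics' minute='*/5' job='/usr/local/bin/collect_metrics.sh'"

Both modules are idempotent, so re-running the first command after editing the script is enough to update all ten machines.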
You could try tools like:
- Zabbix
- Icinga / Nagios / check_mk
- Cacti
- Prometheus / Grafana
Which one fits best highly depends on your needs. I have personally worked with many of the tools listed. I liked the server/agent model of Zabbix, which makes it very flexible for monitoring larger fleets.
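If you go the Zabbix agent route, the usual hook for a custom script is a UserParameter in the agent config (this is also what the comments below point at); a minimal sketch, where the item key and script path are made up for illustration:

# zabbix_agentd.conf on each VM (hypothetical key and path):
UserParameter=custom.metrics,/usr/local/bin/collect_metrics.sh

The collection interval then lives in the item definition on the Zabbix server (e.g. 5m), so there is no per-VM crontab to maintain.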

- Thanks. Could you link me to documentation on using the Zabbix agent to invoke a shell script periodically? I did take a look at Zabbix - https://www.zabbix.com/documentation/4.0/manual/config/items/itemtypes/external - the warning on the page didn't give me enough confidence to take that route. _Do not overuse external checks! As each script requires starting a fork process by Zabbix server, running many scripts can decrease Zabbix performance a lot._ – kn9 Aug 12 '19 at 19:38
- Well, I think this might be misleading. The performance drawback is related to forks, and I guess that would be completely irrelevant for 10 machines and a short-running check. Zabbix is able to handle thousands of machines; this only becomes a real issue if the script takes a long time to execute or the interval is shorter than the script's duration. – hargut Aug 12 '19 at 20:09
- The way we used Zabbix was always in combination with agents on the clients. In such a setup we'd use userparameters. https://www.zabbix.com/documentation/4.0/manual/config/items/userparameters – hargut Aug 12 '19 at 20:10
- Or, for long-running scripts, execute the script with cron and have Zabbix read a log or result file. – hargut Aug 12 '19 at 20:11
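A minimal sketch of that last pattern, where cron runs the (possibly slow) script and the agent only reads the cached result; every path and key here is a placeholder:

# Crontab on each VM: run the script every five minutes and cache its output.
*/5 * * * * /usr/local/bin/collect_metrics.sh > /var/tmp/custom_metrics.out 2>/dev/null
# zabbix_agentd.conf: the check itself just reads the cached file, so it returns quickly.
UserParameter=custom.metrics.cached,cat /var/tmp/custom_metrics.out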