I am about to code something and thought I'd put my idea out here first to see if anyone has comments, etc.
I want to create a Python class for simultaneously monitoring (and merging) multiple log files, to be used as part of an automated test. My plan is to simply run 'tail -f' (over SSH with paramiko, maybe) on each file in a separate thread. Then, every few seconds, collect the stdout from each thread and merge it into one file, with a suffix added to each line identifying the source. This way I can write tests for distributed systems and monitor the logs of maybe a dozen machines at once (several of which serve the same purpose and sit behind a load balancer, etc.).
Startup:

    for machine, logfile in config_list:
        create a thread running tail -f on logfile on machine
    create an accumulator thread that:
        wakes up each second and
        gets the stdout from every tail thread and merges it into one in-memory list

Test_API:

    a method to get/query data from the in-memory accumulator
    (the in-memory list would be the only data item that needs to be synchronized)
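Roughly, I'm picturing something like this for the accumulator side. This is an untested sketch, assuming Python 3's queue module; Accumulator, _run, and query are just names I made up, and the tail threads would put (timestamp, host, logfile, line) tuples on the queue:

    import queue
    import threading
    import time

    class Accumulator:
        def __init__(self, line_queue):
            self._q = line_queue            # tail threads put() lines here; Queue is thread-safe
            self._merged = []
            self._lock = threading.Lock()   # guards _merged, the one shared data item
            t = threading.Thread(target=self._run)
            t.daemon = True                 # don't keep the test process alive on exit
            t.start()

        def _run(self):
            while True:
                time.sleep(1)               # wake up each second...
                while True:                 # ...and drain whatever the readers produced
                    try:
                        item = self._q.get_nowait()
                    except queue.Empty:
                        break
                    with self._lock:
                        self._merged.append(item)

        def query(self, predicate):
            # Test_API: snapshot the merged list under the lock, then filter it
            with self._lock:
                snapshot = list(self._merged)
            return [item for item in snapshot if predicate(item)]

Using a Queue between the tail threads and the accumulator means only the merged list itself needs an explicit lock, which matches the design above.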
So, I am wondering: is paramiko the correct choice? Any caveats about handling the threading (I have never done anything with threading in Python)? Any additional ideas that come to mind?
Thanks in advance!
Feel free to post code snippets. I will update this post with a working solution once it is finished. I anticipate it will be pretty small.
Just found this: Creating multiple SSH connections at a time using Paramiko
EDIT
From looking at a couple of other posts, I have this so far. It just does a plain tail, not a tail -f, and does not have the polling I need.
from someplace import TestLogger
import threading

import paramiko

outlock = threading.Lock()
merged_log = []


def start_watching():
    logger = TestLogger().get()
    logs_to_watch = [('somemachine1', '/var/log/foo'),
                     ('somemachine2', '/var/log/bar')]
    threads = []
    for machine, filename in logs_to_watch:
        logger.info(machine)
        logger.info(filename)
        t = threading.Thread(target=workon, args=(machine, filename))
        t.start()
        threads.append(t)
    for t in threads:
        t.join()
    for merge_line in merged_log:
        logger.info(merge_line.dump())


def workon(host, logfile):
    ssh = paramiko.SSHClient()
    ssh.set_missing_host_key_policy(paramiko.AutoAddPolicy())
    ssh.connect(host, username='yourusername', allow_agent=True, look_for_keys=True)
    # sudo may need NOPASSWD (or a pty) on the remote side to run non-interactively
    stdin, stdout, stderr = ssh.exec_command('sudo tail ' + logfile)
    for line in stdout:
        # take the lock only for the append, so the reader threads
        # don't serialize each other while blocked on network I/O
        with outlock:
            merged_log.append(MergeLogLine(line, host, logfile))
class MergeLogLine:
    def __init__(self, line, host, logfile):
        self._line = line
        self._host = host
        self._logfile = logfile

    def line(self):
        return self._line

    def host(self):
        return self._host

    def logfile(self):
        return self._logfile

    def dump(self):
        return self._line + '(from host = ' + self._host + ', log = ' + self._logfile + ')'
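For the tail -f part, I'm guessing I'll need to replace workon() with something like the following. Untested; it drops down to paramiko's Channel so the read can be polled with recv_ready() instead of blocking in readline(), and uses tail -F so log rotation doesn't kill it (workon_follow is just a placeholder name):

import time

def workon_follow(host, logfile):
    ssh = paramiko.SSHClient()
    ssh.set_missing_host_key_policy(paramiko.AutoAddPolicy())
    ssh.connect(host, username='yourusername', allow_agent=True, look_for_keys=True)
    channel = ssh.get_transport().open_session()
    channel.exec_command('sudo tail -F ' + logfile)
    buf = ''
    while not channel.exit_status_ready():
        if channel.recv_ready():
            buf += channel.recv(4096).decode('utf-8', 'replace')
            # append only complete lines; keep any partial line in buf
            while '\n' in buf:
                line, buf = buf.split('\n', 1)
                with outlock:
                    merged_log.append(MergeLogLine(line, host, logfile))
        else:
            time.sleep(1)   # the once-a-second poll from the original plan

Stopping these threads cleanly is still an open question for me — maybe a shared stop flag checked in that loop, since Python threads can't be killed from the outside.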