
I have a script that sends an email if a task is not in the RUNNING state. The script runs every 5 minutes.

for task_list in status['tasks']:
    if task_list['state'] != 'RUNNING':
        alert_email(connector+'-task-'+str(task_list['id'])+'-status-'+task_list['state'])

What I want is to avoid flooding my inbox. A failure may not be fixed within the next hour or so, and in that case I want to send only 1 email instead of 12. I would appreciate any ideas.

I was thinking of writing the alert to a "sent" directory before emailing it. On the next run, the alert would be sent only if its md5sum doesn't match the alert already written in the sent directory, or if the sent directory is empty.
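The digest comparison described above could be sketched roughly as follows. This is only an illustration under assumptions: the marker file name `last_alert.md5` and the helper names `should_send` and `clear_sent` are hypothetical, and MD5 is used purely for change detection, not for security.

```python
import hashlib
from pathlib import Path

MARKER_NAME = "last_alert.md5"  # hypothetical file name inside the sent directory


def should_send(alert_text, sent_dir):
    """Return True if alert_text differs from the alert recorded on the last run."""
    sent_dir = Path(sent_dir)
    sent_dir.mkdir(parents=True, exist_ok=True)
    marker = sent_dir / MARKER_NAME
    digest = hashlib.md5(alert_text.encode("utf-8")).hexdigest()
    if marker.exists() and marker.read_text().strip() == digest:
        return False  # same alert as before: stay quiet
    marker.write_text(digest)  # record the alert before it is emailed
    return True


def clear_sent(sent_dir):
    """Forget the recorded alert, e.g. once every task is RUNNING again."""
    marker = Path(sent_dir) / MARKER_NAME
    if marker.exists():
        marker.unlink()
```

The caller would build the alert text for the current run, email it only when `should_send(...)` returns True, and call `clear_sent(...)` when no task is down so that a recurrence of the same failure alerts again.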

oraclept
  • What condition, exactly, are you trying to avoid? Sending 12 mails because 12 different tasks are in a fail state at that moment? Or sending 12 mails over the course of an hour because one task is always in a fail state? – John Gordon Jan 29 '23 at 23:58
  • Out of 8 tasks, say task number 6 is down: an email is sent saying that task number 6 is down. The script runs again, determines that task number 6 is still down, and sends another email saying so. I want to avoid this repetitive emailing. If the issue is fixed 60 minutes after the initial occurrence, 12 emails will already have been sent. I want to find a way to avoid that. – oraclept Jan 30 '23 at 01:49
  • The script will need to save the results from the current run (possibly in a plain file or a database), and then on the next run, look at those results to decide whether to send mail. – John Gordon Jan 30 '23 at 02:25
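One possible reading of the suggestion to save each run's results is sketched below: the task states are written to a JSON file, and only tasks whose non-RUNNING state differs from the previous run are reported. The function name `changed_tasks` and the state-file path are assumptions made for illustration.

```python
import json
from pathlib import Path


def changed_tasks(tasks, state_file):
    """Return the tasks that are not RUNNING and whose state is new since the last run."""
    path = Path(state_file)
    # States saved by the previous run, keyed by task id (empty on the first run).
    previous = json.loads(path.read_text()) if path.exists() else {}
    current = {str(t["id"]): t["state"] for t in tasks}
    path.write_text(json.dumps(current))  # save this run's results for the next run
    return [
        t for t in tasks
        if t["state"] != "RUNNING" and previous.get(str(t["id"])) != t["state"]
    ]
```

With this shape, the existing loop would iterate over `changed_tasks(status['tasks'], ...)` instead of `status['tasks']`, so a task that stays down produces one email until its state changes.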

1 Answer


Use a Boolean flag!

alert_sent = False
for task_list in status['tasks']:
    if task_list['state'] != 'RUNNING' and not alert_sent:
        alert_email(connector+'-task-'+str(task_list['id'])+'-status-'+task_list['state'])
        alert_sent = True

The alert_sent flag starts as False. Once an alert email is sent it is set to True, so no further alerts go out for the remaining tasks in that run. The flag only lasts for a single run of the script, so by itself it does not suppress repeats across runs.

UPDATE:

import glob
import os

sent_dir = 'sent'
os.makedirs(sent_dir, exist_ok=True)

for task_list in status['tasks']:
    task_id = str(task_list['id'])
    state = task_list['state']
    # One marker file per task and state records that an alert was already sent.
    filename = os.path.join(sent_dir, f"{task_id}-{state}.txt")
    if state == 'RUNNING':
        # Task has recovered: remove its markers so a future failure alerts again.
        for marker in glob.glob(os.path.join(sent_dir, f"{task_id}-*.txt")):
            os.remove(marker)
    elif not os.path.exists(filename):
        alert_email(connector+'-task-'+task_id+'-status-'+state)
        with open(filename, 'w') as f:
            f.write("Sent alert for task %s in state %s" % (task_id, state))

The script first makes sure a directory named sent exists, creating it if necessary. Then, for each task in the list, it builds a file name from the task ID and state. If the task is not in the RUNNING state and the corresponding file doesn't exist, the alert is sent and the file is created to record that the alert went out.

When a task returns to "RUNNING" (e.g., from "FAILED"), its marker files are deleted, so the next time it leaves that state the script sends a new alert.

UPDATE #2:

import time

alert_sent_time = 0
time_threshold = 300 # 5 minutes in seconds
for task_list in status['tasks']:
    if task_list['state'] != 'RUNNING' and time.time() - alert_sent_time > time_threshold:
        alert_email(connector+'-task-'+str(task_list['id'])+'-status-'+task_list['state'])
        alert_sent_time = time.time()

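The alert is now re-enabled automatically once time_threshold seconds have passed since the last email (5 minutes here, the interval you mentioned in your post). Note that alert_sent_time is held in memory, so this only throttles alerts inside a long-running process; a script started fresh by cron resets it to 0 on every run.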
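For a cron-driven script, one way to make that cooldown survive between runs is to keep the last-alert time on disk. The following is a minimal sketch under assumptions: the helper name `due_for_alert`, the JSON stamp file, and the one-hour default are all hypothetical choices, and the `now` parameter exists only to make the function easy to test.

```python
import json
import time
from pathlib import Path


def due_for_alert(task_id, stamp_file, cooldown=3600, now=None):
    """Return True if no alert for task_id was recorded within the last `cooldown` seconds."""
    now = time.time() if now is None else now
    path = Path(stamp_file)
    # Last-alert timestamps from earlier runs, keyed by task id.
    stamps = json.loads(path.read_text()) if path.exists() else {}
    last = stamps.get(str(task_id))
    if last is not None and now - last < cooldown:
        return False  # still inside the cooldown window
    stamps[str(task_id)] = now
    path.write_text(json.dumps(stamps))
    return True
```

Unlike a marker that is deleted on recovery, this variant re-sends a reminder every `cooldown` seconds while the task stays down, which may or may not be what you want.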

Nova
  • I like the idea of a flag, but this means we have to reset it manually. If someone forgets to reset the flag after fixing the issue, we won't have alerting. – oraclept Jan 30 '23 at 01:58
  • @oraclept check the updated code then ^_^ – Nova Jan 30 '23 at 06:39
  • Thank you @nova for the effort. Here is what I am planning to do: 1. The script runs every 5 minutes via cron (it doesn't run in a while loop). 2. An email is sent and the alert is written to a single file. 3. If the issue is not fixed, the script sees the alert file and does not send an email. 4. Once the issue is fixed, the script deletes the alert file. 5. The next time the issue occurs, step 2 triggers again. – oraclept Feb 01 '23 at 03:19
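The five-step plan above might look something like this in code. It is a sketch only: the function name `process`, the `send` callback (standing in for alert_email), and the alert-file path are assumptions, not part of the original script.

```python
from pathlib import Path


def process(tasks, alert_file, send):
    """Send at most one email per incident, tracked by the presence of alert_file."""
    path = Path(alert_file)
    down = [
        "task-%s-status-%s" % (t["id"], t["state"])
        for t in tasks
        if t["state"] != "RUNNING"
    ]
    if not down:
        # Step 4: the issue is fixed, so delete the alert file.
        if path.exists():
            path.unlink()
        return False
    if path.exists():
        # Step 3: an alert for this incident was already sent.
        return False
    # Step 2: send one email and record it in the single alert file.
    message = "\n".join(down)
    send(message)
    path.write_text(message)
    return True
```

One consequence of using a single file is that a second task failing while the file already exists does not trigger a new email; tracking one file per task would avoid that.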