I have four tasks (t1, t2, t3, t4) that need to be run in sequence on an item (a URL) every 7 days. I use gearman to run these tasks and a cronjob to send the items to the gearman queue. Each task for an item has a date_run assigned to it. If date_run for t1 is more than 7 days in the past, that task is sent to the queue. If date_run for t2 is earlier than date_run for t1, t2 is sent to the queue... and so on for t3 and t4.

The problem arises when t1 for an item has been queued but has not finished before the cronjob kicks in again. Since date_run is not updated until the task is complete, it looks as if the task hasn't been queued, and I end up with duplicates of t1 for the same item in the queue.
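To make the failure mode concrete, here is a minimal sketch of the selection rule as I read it: t1 is due when its date_run is more than 7 days old, and each later task is due when it last ran before its predecessor. The function name `next_due_task` and the dict layout are made up for illustration; they are not from my actual code.

```python
from datetime import datetime, timedelta

INTERVAL = timedelta(days=7)
TASKS = ["t1", "t2", "t3", "t4"]


def next_due_task(date_run, now):
    """Return the name of the next task to queue for one item, or None.

    date_run maps task name -> datetime of its last completed run.
    """
    # t1 is due when its last completed run is more than 7 days old.
    if now - date_run["t1"] > INTERVAL:
        return "t1"
    # Each later task is due when it last ran before its predecessor did.
    for prev, task in zip(TASKS, TASKS[1:]):
        if date_run[task] < date_run[prev]:
            return task
    return None


# Two cron ticks while t1 is still running: date_run has not changed yet,
# so the same task is selected (and would be queued) on both ticks.
now = datetime(2012, 10, 15, 4, 0)
item = {t: now - timedelta(days=8) for t in TASKS}
first_tick = next_due_task(item, now)
second_tick = next_due_task(item, now + timedelta(minutes=5))
```

Both ticks pick t1, which is exactly the duplicate. Only once t1 completes and its date_run is updated does the rule move on to t2.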

The solutions I've thought of are:

  • Add a unique identifier to each task and check whether it has already been queued
  • Just check if the queue is empty or not and don't queue any more tasks until it is
  • Add a date_queued to the item table and use this instead of date_run on t1 to schedule the tasks every 7 days
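For the third option, a rough sketch of what scheduling on date_queued could look like. The item is a plain dict and the queue a plain list here, both stand-ins for the real item table and the gearman queue; `should_queue_t1` and `cron_tick` are hypothetical names.

```python
from datetime import datetime, timedelta

INTERVAL = timedelta(days=7)


def should_queue_t1(date_queued, now):
    """Decide from date_queued (not date_run) whether t1 should be queued."""
    return date_queued is None or now - date_queued >= INTERVAL


def cron_tick(item, queue, now):
    """Queue t1 for the item if due, stamping date_queued at enqueue time."""
    if should_queue_t1(item.get("date_queued"), now):
        queue.append(("t1", item["url"]))
        item["date_queued"] = now  # updated immediately, before the task runs


queue = []
item = {"url": "http://example.com/", "date_queued": None}
start = datetime(2012, 10, 15, 4, 0)
cron_tick(item, queue, start)                          # queued
cron_tick(item, queue, start + timedelta(minutes=5))   # skipped, t1 still running
cron_tick(item, queue, start + timedelta(days=7))      # queued for the next cycle
```

The trade-off with this approach is that if a queued job is lost before it runs, the item is not retried until the next 7-day cycle, since date_queued says it was already handled.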

I thought I'd check on Stack Overflow first, though: is there a "best way" to solve this problem? I can't seem to get my head around it. :S

Thanks!

user1493124

2 Answers

How about using a run-only-once task? I would use a simple singleton with locking instead of modifying the data setup.

C# example:

public class SingletonScheduledTask
{
    public void Run()
    {
        _SingletonScheduledTask.Instance.Handle();
    }

    private sealed class _SingletonScheduledTask
    {
        // Thread-safe singleton, instantiated when the type is first loaded.
        private static readonly _SingletonScheduledTask instance = new _SingletonScheduledTask();

        private static readonly object syncHandle = new object();
        private static bool handling = false;

        private _SingletonScheduledTask() { }

        public static _SingletonScheduledTask Instance
        {
            get
            {
                return instance;
            }
        }

        public void Handle()
        {
            // Check and set the flag inside the lock so only one caller can claim the run.
            lock (syncHandle)
            {
                if (handling)
                {
                    return; // another run is already in progress
                }
                handling = true;
            }

            try
            {
                _DoHandle();
            }
            finally
            {
                // Always release the flag, even if the task code throws.
                lock (syncHandle)
                {
                    handling = false;
                }
            }
        }

        private void _DoHandle()
        {
            // task code here
        }
    }
}
Alin Mircea
  • Thank you! Your solution gave me an idea, perhaps exactly what you meant. Gearman accepts a unique ID when adding the tasks. It will only add one task with the unique ID to the queue, so this solves my problem completely. – user1493124 Oct 15 '12 at 04:19

Just add a unique ID when adding the task to the queue. No matter how many times you add it, gearman will only register one job for that ID for as long as the job is still in the queue.
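As a rough illustration of the idea, here is a small in-memory stand-in for the queue. It does not talk to gearman at all. `CoalescingQueue` and the `unique_id` scheme (a hash of task name plus URL) are assumptions of mine, meant only to show how a stable ID per task and item makes repeated submissions harmless.

```python
import hashlib


def unique_id(task, url):
    """Derive a stable ID from task name and item URL (hypothetical scheme)."""
    return hashlib.sha1(f"{task}:{url}".encode("utf-8")).hexdigest()


class CoalescingQueue:
    """In-memory stand-in for a job server that keeps one job per unique ID."""

    def __init__(self):
        self._pending = {}

    def add(self, task, url):
        """Add a job; return False if a job with the same ID is still pending."""
        key = unique_id(task, url)
        if key in self._pending:
            return False
        self._pending[key] = (task, url)
        return True

    def complete(self, task, url):
        """Mark a job finished so the same ID can be queued again next cycle."""
        self._pending.pop(unique_id(task, url), None)

    def __len__(self):
        return len(self._pending)


q = CoalescingQueue()
added_first = q.add("t1", "http://example.com/")
added_again = q.add("t1", "http://example.com/")   # cron fired again too early
added_other = q.add("t1", "http://example.org/")   # different item, different ID
```

With a real client, the same key would be passed as the job's unique ID when submitting, and the cronjob can keep submitting on every run without checking the queue first.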

user1493124