Database design for task scheduling

Question

I have tasks which need to be repeatedly executed in specified hours and days. For example, task A should be executed at 1, 2 and 3 p.m. on Monday and at 2, 8, 11 p.m. on Friday. Task B should be executed at 4 a.m. on Monday and so on. Basically every task could be executed at any hour of any day. If all tasks for current hour have been executed we execute them all again. So it's like a priority queue for every hour.

I'm using Postgres and right now it's JSONB facilities. I have table tasks which contain task related info and schedule. Schedule for every task looks something like this:

{
  "Mo": [
    1,
    2,
    3
  ],
  "Fr": [
    4,
    5,
    6
  ]
}

But I don't like that json is not strongly typed, query which selects tasks looks really ugly and I think performance of query could be better.

I thought may be I create table where I store each hour of week and, for example, hours and table executions where I store id of tasks correlated to id of hours. Could work, but it seems very bad for storage space(amount of tasks is about 100000 right now and growing).

So I wonder how to design database properly for such case?

You don't need to store a table full of dates/times. You can use a numbers/[calendar table](http://stackoverflow.com/questions/5821915/calendar-tables-in-postgresql-9) to populate a period of time as part of your query, then join a table containing the events. — JohnHC, Jan 17 '17 at 11:04
@JohnHC thank you, but could you please explain your idea more? I'm little bit lost. Also I don't really need dates, just weekday and hour. — Artem Malinko, Jan 17 '17 at 11:17

JohnHC · Answer 1 · 2017-01-17T11:48:12.643

0

Ok, you need to store the tasks, but not a full calendar (create the calendar on the fly when extracting data)...

For each event/task, you will need to store a start datetime and either a duration or an end datetime.

create table TASKS (ID integer constraint TASK_ID primary key,
                    TASK_NAME varchar(200),
                    REPEAT_DAY int not null,
                    REPEAT_TIME time not null);

to select the data, use a CTE to populate a calendar on the fly

with TODAY_CAL as
(
select current_date + (n || ' hour')::INTERVAL as CalTime -- use an appropriate date here, adjust for granularity (hour/minute/second)
from generate_series(0, 23) n 
)
select c1.CalTime, t2.TaskName
from TODAY_CAL c1
left join TASKS t2
  on datepart('DOW', c1.CalTime) = t2.Repeat_Day
  and datepart('Hour', c1.CalTime) = t2.Repeate_Time

edited Jan 17 '17 at 11:48

answered Jan 17 '17 at 11:30

JohnHC

10,935
1
24
40

Thank you for your answer. But how do I store task which needs to be executed, for example, at 1 p.m. on Monday and at 4 a.m. on Sunday? Should it be two different tasks then? Also I'm not sure if need dates because it should be every week, not just particular week. – Artem Malinko Jan 17 '17 at 11:38
@ArtemMalinko I see... Let me redo – JohnHC Jan 17 '17 at 11:41

Database design for task scheduling

1 Answers1