
Question

I'm trying to write a simple employee scheduling application for about 10-20 people at my software development company. After some consideration I settled on a web app in Python, Ruby or PHP with a Postgres or MySQL database. While designing the database models, I began to wonder which data structure would actually be best for this kind of application.

What it will look like

A month view in the app would look similar to this:

 OCTOBER    1 2 3 4 5 6 7 8 9 ...
John Apple  M M A A N N O O O ...
Daisy Pear  O O O M M A A N N ...
Steve Cat   A A N N O O O M M ...
Maria Dog   N N O O O M M A A ...

where M stands for Morning shift, A for Afternoon shift, etc. (the letters could be replaced with codes).

What data structure or database design would be best for this? I was thinking of storing one string per employee per month (at most 31 characters, one character per day), such as "MMAANNOOOAAMMNNAAOO...". A Month table would hold these strings for each employee.

What would you suggest?

skaffman
M_1

3 Answers


I would go with a three-table Kimball star schema (Date, Employee, Schedule), because sooner or later you will be asked to create (demanding) reports out of this. Who worked the most nights? Who worked the most weekends? Who never works weekends? Why am I always scheduled for Friday afternoon? On which day of the week are certain employees most likely not to show up? And so on.

Tables would be:

TABLE dimDate (
    KeyDate
  , FullDate
  , DayOfWeek
  , DayNumberInWeek
  , IsHoliday
  ,... more here
)

You can pre-fill the dimDate table for 10 years or so; you may need to tweak the "IsHoliday" column from time to time.

The Employee table also changes (relatively) rarely.
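A minimal sketch of that pre-fill step, assuming Python with the standard-library `sqlite3` module standing in for Postgres/MySQL. The column types, the `yyyymmdd` integer used for KeyDate, and the `Year` column (which the report query in this answer relies on) are assumptions; `date_rows` is a hypothetical helper name.

```python
import sqlite3
from datetime import date, timedelta

DAY_NAMES = ["Monday", "Tuesday", "Wednesday", "Thursday",
             "Friday", "Saturday", "Sunday"]

def date_rows(start, end):
    """Yield one dimDate row per calendar day from start to end, inclusive."""
    day = start
    while day <= end:
        yield (
            int(day.strftime("%Y%m%d")),  # KeyDate as yyyymmdd (assumed convention)
            day.isoformat(),              # FullDate
            day.year,                     # Year (assumed column)
            DAY_NAMES[day.weekday()],     # DayOfWeek
            day.isoweekday(),             # DayNumberInWeek, Monday = 1
            "No",                         # IsHoliday, adjusted by hand later
        )
        day += timedelta(days=1)

conn = sqlite3.connect(":memory:")
conn.execute("""
    CREATE TABLE dimDate (
        KeyDate         INTEGER PRIMARY KEY,
        FullDate        TEXT    NOT NULL,
        Year            INTEGER NOT NULL,
        DayOfWeek       TEXT    NOT NULL,
        DayNumberInWeek INTEGER NOT NULL,
        IsHoliday       TEXT    NOT NULL DEFAULT 'No'
    )
""")

# Ten years of dates, inserted in one batch.
conn.executemany(
    "INSERT INTO dimDate VALUES (?, ?, ?, ?, ?, ?)",
    date_rows(date(2009, 1, 1), date(2018, 12, 31)),
)

# Tweaking the holiday flag afterwards is a plain UPDATE.
conn.execute("UPDATE dimDate SET IsHoliday = 'Yes' WHERE FullDate = '2009-12-25'")

row_count = conn.execute("SELECT COUNT(*) FROM dimDate").fetchone()[0]
```

Holidays vary by country and company, so the sketch leaves every day as a non-holiday and flags exceptions separately.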

TABLE dimEmployee (
    KeyEmployee
  , FirstName
  , LastName
  , Age
  , ... more here
)

The Schedule table is where you fill in the work schedule. I have also suggested "HoursOfWork" for each shift; this way it is easy to aggregate hours in reports, such as: "How many hours did John Doe work last year on holidays?"

TABLE factSchedule (
    KeySchedule  -- surrogate PK
  , KeyDate      -- FK to dimDate table
  , KeyEmployee  -- FK to dimEmployee table
  , Shift        -- shift number (degenerate dimension)
  , HoursOfWork  -- number of work hours in that shift
)

Instead of using the surrogate KeySchedule, you could combine KeyDate, KeyEmployee and Shift into a composite primary key to make sure you cannot schedule the same person on the same shift on the same day. If the surrogate key is used, check this in the application layer. When querying, join the tables like this:

SELECT SUM(s.HoursOfWork)
  FROM factSchedule AS s
  JOIN dimDate      AS d ON s.KeyDate = d.KeyDate
  JOIN dimEmployee  AS e ON s.KeyEmployee = e.KeyEmployee
 WHERE e.FirstName = 'John'
   AND e.LastName = 'Doe'
   AND d.Year = 2009
   AND d.IsHoliday = 'Yes';
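To try the query end to end, here is a small sketch that runs it from Python against an in-memory SQLite database. SQLite is only a stand-in for Postgres/MySQL; the column types, the sample rows and the `holiday_hours` helper are made up for illustration, and `Year` is assumed to be one of the "more here" columns of dimDate.

```python
import sqlite3

conn = sqlite3.connect(":memory:")
conn.executescript("""
    CREATE TABLE dimDate (
        KeyDate INTEGER PRIMARY KEY, FullDate TEXT, Year INTEGER, IsHoliday TEXT
    );
    CREATE TABLE dimEmployee (
        KeyEmployee INTEGER PRIMARY KEY, FirstName TEXT, LastName TEXT
    );
    CREATE TABLE factSchedule (
        KeyDate INTEGER, KeyEmployee INTEGER, Shift INTEGER, HoursOfWork REAL,
        PRIMARY KEY (KeyDate, KeyEmployee, Shift)
    );

    -- Made-up sample data.
    INSERT INTO dimDate VALUES
        (20091224, '2009-12-24', 2009, 'No'),
        (20091225, '2009-12-25', 2009, 'Yes'),
        (20100101, '2010-01-01', 2010, 'Yes');
    INSERT INTO dimEmployee VALUES (1, 'John', 'Doe'), (2, 'Daisy', 'Pear');
    INSERT INTO factSchedule VALUES
        (20091224, 1, 1, 8.0),   -- John, ordinary day
        (20091225, 1, 3, 6.0),   -- John, holiday in 2009
        (20100101, 1, 1, 8.0),   -- John, holiday in 2010
        (20091225, 2, 1, 8.0);   -- Daisy, holiday in 2009
""")

# Same joins as the query above, with the name and year as parameters.
# COALESCE turns the NULL that SUM returns for "no rows" into 0.
HOLIDAY_HOURS = """
    SELECT COALESCE(SUM(s.HoursOfWork), 0)
      FROM factSchedule AS s
      JOIN dimDate      AS d ON s.KeyDate = d.KeyDate
      JOIN dimEmployee  AS e ON s.KeyEmployee = e.KeyEmployee
     WHERE e.FirstName = ?
       AND e.LastName = ?
       AND d.Year = ?
       AND d.IsHoliday = 'Yes'
"""

def holiday_hours(first_name, last_name, year):
    """Total hours the named employee worked on holidays in the given year."""
    return conn.execute(HOLIDAY_HOURS, (first_name, last_name, year)).fetchone()[0]

john_2009 = holiday_hours("John", "Doe", 2009)  # 6.0 with the sample rows
```

Passing the name and year as parameters rather than concatenating them into the SQL string keeps the query safe to call from a web app.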

If you are using MySQL, it is OK to use MyISAM as the storage engine and implement your foreign keys (FK) as "logical only". MyISAM does not enforce them, so the application layer has to take care of referential integrity.
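For comparison, on an engine that does enforce constraints, the composite key and the foreign keys can do that checking for you. The following is a rough sketch using Python's `sqlite3` as a stand-in; the column types and the `try_schedule` helper are assumptions, and other engines will report violations with their own error types.

```python
import sqlite3

conn = sqlite3.connect(":memory:")
conn.execute("PRAGMA foreign_keys = ON")  # SQLite leaves FK enforcement off by default
conn.executescript("""
    CREATE TABLE dimDate (
        KeyDate  INTEGER PRIMARY KEY,
        FullDate TEXT NOT NULL
    );
    CREATE TABLE dimEmployee (
        KeyEmployee INTEGER PRIMARY KEY,
        FirstName   TEXT NOT NULL,
        LastName    TEXT NOT NULL
    );
    CREATE TABLE factSchedule (
        KeyDate     INTEGER NOT NULL REFERENCES dimDate (KeyDate),
        KeyEmployee INTEGER NOT NULL REFERENCES dimEmployee (KeyEmployee),
        Shift       INTEGER NOT NULL,
        HoursOfWork REAL    NOT NULL,
        -- Composite key: one row per person, day and shift.
        PRIMARY KEY (KeyDate, KeyEmployee, Shift)
    );
    INSERT INTO dimDate VALUES (20091001, '2009-10-01');
    INSERT INTO dimEmployee VALUES (1, 'John', 'Apple');
""")

def try_schedule(key_date, key_employee, shift, hours):
    """Insert one assignment; return False if the database rejects it."""
    try:
        with conn:  # commits on success, rolls back on error
            conn.execute(
                "INSERT INTO factSchedule VALUES (?, ?, ?, ?)",
                (key_date, key_employee, shift, hours),
            )
        return True
    except sqlite3.IntegrityError:
        return False

first = try_schedule(20091001, 1, 1, 8.0)      # accepted
duplicate = try_schedule(20091001, 1, 1, 8.0)  # same person, day and shift: rejected
unknown = try_schedule(20091001, 99, 1, 8.0)   # no such employee: rejected by the FK
```

With MyISAM, the same two checks would have to be written as lookups in the application before each insert.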

Hope this helps.


[Image: empschd_model_01]

Damir Sudarevic
  • Thanks, trying to implement something similar. – M_1 Oct 29 '09 at 21:45
  • Thank you very much for the advice. I started writing the web app in Django based on your design. It's very helpful. Thanks a lot again :) – M_1 Nov 01 '09 at 23:11

A quick answer first:

  • EmployeeID
  • Date
  • ShiftType

That said, the best database design largely depends on what you're going to do with the data. If all you need to do is store the records and display them in a table similar to your example, your approach (while not elegant) would work.

However, if you're going to retrieve the data or run reports, you're going to want something a little more structured than a string where each character represents the type of shift assignment.
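As a rough illustration, the three columns above can still produce the month view from the question. This sketch uses Python's `sqlite3`; the table name `Assignment`, the one-shift-per-day primary key, the `-` placeholder for unassigned days and the `month_rows` helper are all assumptions made for the example.

```python
import sqlite3
from calendar import monthrange
from collections import defaultdict

conn = sqlite3.connect(":memory:")
conn.executescript("""
    CREATE TABLE Assignment (
        EmployeeID INTEGER NOT NULL,
        Date       TEXT    NOT NULL,   -- ISO format, yyyy-mm-dd
        ShiftType  TEXT    NOT NULL,   -- 'M', 'A', 'N' or 'O'
        PRIMARY KEY (EmployeeID, Date) -- assumes one shift per person per day
    );
    INSERT INTO Assignment VALUES
        (1, '2009-10-01', 'M'), (1, '2009-10-02', 'M'), (1, '2009-10-03', 'A'),
        (2, '2009-10-01', 'O'), (2, '2009-10-03', 'O');
""")

def month_rows(year, month):
    """Return {EmployeeID: 'MMA...'}, with '-' for days that have no record."""
    days_in_month = monthrange(year, month)[1]
    grid = defaultdict(lambda: ["-"] * days_in_month)
    prefix = f"{year:04d}-{month:02d}-"
    query = "SELECT EmployeeID, Date, ShiftType FROM Assignment WHERE Date LIKE ?"
    for employee_id, day, shift in conn.execute(query, (prefix + "%",)):
        grid[employee_id][int(day[8:10]) - 1] = shift  # day of month -> list index
    return {employee_id: "".join(cells) for employee_id, cells in grid.items()}

rows = month_rows(2009, 10)
# rows[1] starts with "MMA"; rows[2] starts with "O-O" because day 2 has no record.
```

The string from the question becomes a display format built on demand, while each assignment stays individually queryable.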

Larsenal
  • I considered doing the "normal" database design that comes to mind right away, but wouldn't it be overkill for something that should be that simple? I may want to do a report or two, but not much more than that. – M_1 Oct 27 '09 at 23:35
  • @Muzzyn1 - Using your design, how much effort would you have to go to to determine who was working last Tuesday afternoon? Or who is scheduled for next Friday morning? This database should be so simple to normalise that it is definitely not 'overkill'. – Kirk Broadhurst Oct 27 '09 at 23:45
  • Sorry, I am not exactly a database specialist. Low-level programming is closer to me (maybe that is why I'm trying to use concepts from low-level software design?). This project is mainly to teach myself some web app programming and to make my life and others' lives a bit easier when it comes to shifts :) Is there anything else that could be taken into consideration apart from the normalized database design presented here? – M_1 Oct 27 '09 at 23:54
  • 1
    A good gut check is to write out some sample queries for known scenarios. Kirk's examples are a good starting point. If it's just a small one-off project and you can easily write queries against your data model, then go with it. The downside of a normalized approach is that it's (arguably) not (as) easy to pull up a row of the database to get a snapshot of a person's schedule. If you wanted to manually edit a person's schedule by manipulating the raw data, your approach may be easier. However, if you want to address individual assignments, a normalized approach may be better. – Larsenal Oct 28 '09 at 00:06
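One way to run that gut check is to answer the same question ("who is on the afternoon shift on a given day?") against both designs. This sketch reuses the first nine days of the sample grid from the question; the `Assignment` table and both helper functions are hypothetical names, and SQLite stands in for the real database.

```python
import sqlite3

# Design 1: one string per employee per month, as proposed in the question.
october = {
    "John Apple": "MMAANNOOO",
    "Daisy Pear": "OOOMMAANN",
    "Steve Cat":  "AANNOOOMM",
    "Maria Dog":  "NNOOOMMAA",
}

def on_shift_from_strings(month, day, shift):
    """Names whose month string has `shift` at position day - 1."""
    return sorted(
        name for name, days in month.items()
        if len(days) >= day and days[day - 1] == shift
    )

# Design 2: one row per employee per day.
conn = sqlite3.connect(":memory:")
conn.execute("""
    CREATE TABLE Assignment (
        Employee  TEXT NOT NULL,
        Date      TEXT NOT NULL,
        ShiftType TEXT NOT NULL,
        PRIMARY KEY (Employee, Date)
    )
""")
conn.executemany(
    "INSERT INTO Assignment VALUES (?, ?, ?)",
    [
        (name, f"2009-10-{day:02d}", code)
        for name, days in october.items()
        for day, code in enumerate(days, start=1)
    ],
)

def on_shift_from_rows(date, shift):
    """Names scheduled for `shift` on the given ISO date."""
    rows = conn.execute(
        "SELECT Employee FROM Assignment"
        " WHERE Date = ? AND ShiftType = ? ORDER BY Employee",
        (date, shift),
    )
    return [name for (name,) in rows]

from_strings = on_shift_from_strings(october, 3, "A")  # ['John Apple']
from_rows = on_shift_from_rows("2009-10-03", "A")      # ['John Apple']
```

Both versions give the same answer here. The difference is that the string version has to load every employee's month into the application and index into it, while the row version is a single filtered query that the database can also aggregate or join.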

I'd suggest a more normalized database, e.g. one table for persons and another that holds the shift information for each combination of person and date.

Lucero