0

I'm currently developing a PM system for a website, in which users must be able to send PM's to multiple recipients. Naturally, this means that if a message has been sent to user A, B and C, user C could delete the message while user A and B won't. The question is what would be the best database table structure for such a system, of course avoiding multiple copies of one message. Currently I've thought of this table structure:

msgid (int),
parentid (int),
timestamp (timestamp),
senderid (int),
recipients (varchar),
subject (varchar)
text (text),
deletedby (varchar),
readby (varchar)

This would be the only table. Threads are created based on the parentid's (if there is no parentid the message is the first in a thread), and ordered by timestamp. Recipients are stored comma-separated in one column and checked by using WHERE userid IN (msg.recipients). The deletedby column contains all id's (comma-separated) of users which have deleted the message, just like the readby column.

However I am not sure if this is an ideal table structure. Before I start coding I would like to hear your thoughts for improvements.

Cœur
  • 37,241
  • 25
  • 195
  • 267
carlo
  • 700
  • 2
  • 9
  • 25

1 Answers1

4

Storing CSV values in a single field is invariably a bad design and will just cause you severe pain. Normalize the design now, before you roll out the system for production: put the recipients list into a child table, and put a "deleted" flat on the child record to indicate whether that particular recipient deleted the message or not:

recipientsTable

messageID    int -> foreign key to messages table
recipientID   int -> foreign key to users table
read     bit - t = read, f = unread
deleted   bit - t = deleted, f = still there.
readON    date - timestamp of when recipient read message

... or something similar.

Marc B
  • 356,200
  • 43
  • 426
  • 500
  • OK so in short, you suggest having two tables, one with the messages and one in which a row is inserted for each receipient, correct? – carlo Nov 07 '11 at 17:04
  • Yes. Having comma-separated values in a field is highly painful if you need to modify the list. Searching isn't so bad, as mysql has find_in_set(), but on other DB systems, it gets ugly fast. Removing a particular value from the csv is painful regardless of the underlying system. – Marc B Nov 07 '11 at 17:05
  • Allright, thank you. And what do you think about the thread system? Is the parentid and timestamp-ordering model in my example okay? – carlo Nov 07 '11 at 17:17
  • sounds serviceable. recursive structures get painful in sql when you start layering the children fairly deeply, though. – Marc B Nov 07 '11 at 17:20
  • True, could you think of an easier way to do this? – carlo Nov 07 '11 at 17:27
  • There's ways to do tree structures in SQL, but each has different benefits/drawbacks. I wouldn't worry about it too much, unless you're doing a "traverse the message history" operation frequently. For one or two levels, it's not a big deal. If you get lots of message threads going 200+ levels deep, THEN start worrying. – Marc B Nov 07 '11 at 17:30
  • Okay, I'm now thinking of making another table containing the replies, having a column mainid as foreign key to the 'main' messageid in the messages table. Would this be a decent solution? – carlo Nov 07 '11 at 17:37
  • Same problem. e.g message #300 is a reply to message #200 is a reply to message #100 is a reply to message #50, etc... Unless you want to disallow this kind of nesting, or put limits on it, eventually you'll get some kind of insanely nested conversation. – Marc B Nov 07 '11 at 17:40