12

I have a query that gets all the info I need for a messaging system's main page (including unread message count, etc)... but it currently retrieves the original threads message. I would like to augment the below query to grab the most recent message in each thread instead.

This query is very close, however my mediocre SQL skills are keeping me from wrapping things up...

$messages = array();
$unread_messages_total = 0;

$messages_query = "
SELECT m.*
    , COUNT(r.id) AS num_replies
    , MAX(r.datetime) AS reply_datetime
    , (m.archived NOT LIKE '%,".$cms_user['id'].",%') AS message_archive
    , (m.viewed LIKE '%,".$cms_user['id'].",%') AS message_viewed 
    , SUM(r.viewed NOT LIKE '%,".$cms_user['id'].",%') AS unread_replies 
    , CASE
        WHEN MAX(r.datetime) >= m.datetime THEN MAX(r.datetime)
        ELSE m.datetime
        END AS last_datetime
FROM directus_messages AS m
LEFT JOIN directus_messages as r ON m.id = r.reply
WHERE m.active = '1'  
AND (m.to LIKE '%,".$cms_user['id'].",%' OR m.to = 'all' OR m.from = '".$cms_user['id']."') 
GROUP BY m.id
HAVING m.reply = '0' 
ORDER BY last_datetime DESC";

foreach($dbh->query($messages_query) as $row_messages){
    $messages[] = $row_messages;
    $unread_messages_total += (strpos($row_messages['archived'], ','.$cms_user['id'].',') === false && ( (strpos($row_messages['viewed'], ','.$cms_user['id'].',') === false && $row_messages['unread_replies'] == NULL) || ($row_messages['unread_replies']>0 && $row_messages['unread_replies'] != NULL) ) )? 1 : 0;
}

Thanks in advance for any help you can provide!

EDIT: (Database)

CREATE TABLE `cms_messages` (
  `id` int(10) NOT NULL auto_increment,
  `active` tinyint(1) NOT NULL default '1',
  `subject` varchar(255) NOT NULL default '',
  `message` text NOT NULL,
  `datetime` datetime NOT NULL default '0000-00-00 00:00:00',
  `reply` int(10) NOT NULL default '0',
  `from` int(10) NOT NULL default '0',
  `to` varchar(255) NOT NULL default '',
  `viewed` varchar(255) NOT NULL default ',',
  `archived` varchar(255) NOT NULL default ',',
  PRIMARY KEY  (`id`)
) ENGINE=MyISAM AUTO_INCREMENT=1 DEFAULT CHARSET=latin1;

EDIT 2: (Requirements)

  • Return all parent messages for a specific user_id: $cms_user['id']
  • Return the number of replies for that parent message: num_replies
  • Return the number of unread replies for that parent message: unread_replies
  • Return the date of the parent message or it's most recent reply: last_datetime
  • Return whether the message is in the archive: message_archive
  • Return whether the message has been viewed: message_viewed
  • Return all messages in DESC datetime order
  • Return the newest message, from the parent or replies if there are some (like gmail)
RANGER
  • 1,643
  • 2
  • 17
  • 31
  • Can you attach the table schema? and the logics of how to identify the recent message. thanx – yoavmatchulsky Jun 29 '11 at 06:35
  • @yoavmatchulsky - Schema added. Rows are either parent messages (`reply=0`) or replies (`reply=parent_id`). I would like to have the parent message if there are no replies, or the most recent reply otherwise (latest reply for that parent message id ordered by datetime DESC). Does that make sense? – RANGER Jun 29 '11 at 06:46
  • Do you have deep nesting messages? or just one level? – yoavmatchulsky Jun 29 '11 at 06:59
  • If one level - this can be easily done with another join. If deep nesting -then you will have to programatically find this out. (or have another table/column that stores the main thread and the latest reply id. – Mez Jun 29 '11 at 14:57
  • Problem with deep nesting could be solved by switching to another DBMS that supports recursive `select` queries (e.g., PostgreSQL), if it is not too late. – binaryLV Jun 29 '11 at 15:04
  • @yoavmatchulsky @Mez @binaryLV - There is only one level of messages, no deep nesting (replies to replies). Thanks! – RANGER Jun 29 '11 at 18:09
  • @dqhendricks - Yes, I'm using a foreach loop over the query and it returned all the replies as separate messages. – RANGER Jul 06 '11 at 04:42
  • (1) Is `directus_messages` the same as `cms_messages`? (2) `From` seems to be ID of author; what is `to` field for? – binaryLV Jul 06 '11 at 06:43
  • @cbh if you are grouping by m.id, how is it returning each reply as a separate row? doesn't make sence. I would like to see the code. – dqhendricks Jul 06 '11 at 17:38
  • @binaryLV - yes, `directus_messages` is the same as `cms_messages` ... `from` is the `id` of the author, while `to` is a csv of user `id`s – RANGER Jul 06 '11 at 17:55

3 Answers3

5

If you have only 2 levels of messages (i.e., only parent messages and direct answers), you might try this query:

select
    root_message.id,
    root_message.active,
    root_message.subject,
    case
        when max_reply_id.max_id is null then 
            root_message.message
        else
            reply_message.message
    end as message,
    root_message.datetime,
    root_message.reply,
    root_message.from,
    root_message.to,
    root_message.viewed,
    root_message.archived
from
    -- basic data
    cms_messages as root_message
    -- ID of last reply for every root message
    left join (
        select 
            max(id) as max_id, 
            reply as parent_id 
        from 
            cms_messages
        where
            reply <> 0 
        group by 
            reply
    ) as max_reply_id on max_reply_id.parent_id = root_message.id                                              
    left join cms_messages as reply_message on reply_message.id = max_reply_id.max_id
where
    root_message.reply = 0

It uses subquery max_reply_id as source of data to select ID of the latest answer. If it exists (i.e., if there are answers), reply_message.message is used. If it does not exist (no answer has been found for root message), then root_message.message is used.

You should also think about structure of table. E.g., it would make more sense if reply contained either NULL, if it is parent message, or ID of existing message. Currently, you set it to 0 (ID of non-existent message), which is wrong. Types of viewed and archived are also weird.

Edit: you should also avoid using having clause. Use where instead, when possible.


Here's a new query that should fulfil your requirements. If there is any problem with it (i.e., if it returns wrong data), let me know.

Like the first query, it:

  • uses subquery reply_summary to accumulate data about replies (ID of last reply, number of replies and number of unread replies);
  • joins this subquery to the base table;
  • joins cms_messages as reply_message to the subquery, based on reply_summary.max_reply_id, to get data about the last reply (message, datetime).

I've simplified the way how you determine last_datetime - it now takes either time of last reply (if there is any reply), or time of original post (when no replies are found).

I have not filtered replies by from and to fields. If it is necessary, where clause of reply_summary subquery should be updated.

select
    parent_message.id,
    parent_message.subject,
    parent_message.message,
    parent_message.from,
    parent_message.to,
    coalesce(reply_summary.num_replies, 0) as num_replies,
    last_reply_message.datetime as reply_datetime,
    (parent_message.archived NOT LIKE '%,{$cms_user['id']},%') AS message_archive,
    (parent_message.viewed   LIKE     '%,{$cms_user['id']},%') AS message_viewed,
    reply_summary.unread_replies,
    coalesce(last_reply_message.message, parent_message.message) as last_message,
    coalesce(last_reply_message.datetime, parent_message.datetime) as last_datetime
from
    cms_messages as parent_message
    left join (
        select
            reply as parent_id,
            max(id) as last_reply_id,
            count(*) as num_replies,
            sum(viewed not like '%,{$cms_user['id']},%') as unread_replies
        from
            cms_messages
        where
            reply <> 0 and
            active = 1
        group by
            reply
    ) as reply_summary on reply_summary.parent_id = parent_message.id
    left join cms_messages as last_reply_message on last_reply_message.id = reply_summary.last_reply_id
where
    parent_message.reply = 0 and
    parent_message.active = 1 and
    (parent_message.to like '%,{$cms_user['id']},%' or parent_message.to = 'all' or parent_message.from = '{$cms_user['id']}')
order by
    last_datetime desc;
binaryLV
  • 9,002
  • 2
  • 40
  • 42
  • I can change the reply to allow `NULL` instead of `0`, but your query doesn't seem to get messages specific to a `user_id` as the original does. It also doesn't calculate `num_replies` for each parent item... I have added the initial requirements to the post. – RANGER Jun 29 '11 at 18:24
  • Your initial question was about getting "most recent message" instead of original message. I gave you perfectly working(?) solution for that. To filter messages of specific user, just update `where` clause. To get number of replies, you can include `select(*) as number_of_replies` in `max_reply_id` subquery (and in the list of values to be selected in main query, of course). – binaryLV Jul 04 '11 at 09:42
  • 1
    @cbh, if you need a help, ask for it. If you need a 100% complete solution - pay for it. We do not have loads of time to give you 100% solutions, though we can give hints and we can help to solve the problems, but you have to participate in that too instead of just giving requirements. – binaryLV Jul 04 '11 at 09:45
  • If my wording was off in the question I apologize. I edited the question to include very specific requirements a while ago. I'm not looking for someone to solve this problem for me, the code I provided in the question fulfills all but one of the requirements. All I need is help adapting that sql to grab the newest `message` instead of the first. You provided an almost entirely new SQL query that fulfilled the last requirement without any of the originals. I understand your answer and have written something similar, the true nature of the question was to combine the two queries – RANGER Jul 05 '11 at 21:25
  • Those requirements were added after I provided answer to the initial question. Anyway, I've updated my answer with new query. Let me know if it does not work as expected. – binaryLV Jul 06 '11 at 07:47
  • Thank you! This is exactly what I was trying to figure out. It still doesn't return a `reply_message` though... even when there is a `num_replies` greater than 0. Any thoughts? – RANGER Jul 06 '11 at 18:20
  • Got it to work, just needed `coalesce(last_reply_message.message, parent_message.message) as last_message`. Thank you so much for your help! Enjoy the +50! – RANGER Jul 06 '11 at 20:03
  • Yes, I forgot to add the `last_message`. If you need "either last message, or parent message", then use `coalesce` on both columns as you did. If you just need parent message in one column and last reply message, when it exists, in another column, select just `parent_message.message as parent_message, last_reply_message.message as reply_message`. In this case, when there are no replies, `reply_message` will be `NULL`, otherwise it should always be filled. – binaryLV Jul 07 '11 at 06:29
3

your problem is that you are fetching only m records no matter what the order of the r records.

try adding

SELECT m.*, r.*

or

SELECT r.*, m.*

if you are using PDO::FETCH_ASSOC as your PDO fetch mode (assuming you are using PDO to access your database), the result will be an associative array where if the result set contains multiple columns with the same name, PDO::FETCH_ASSOC returns only a single value per column name. not sure which order takes presidence, so you would have to try both.

if your columns are defined in the right order, they will return the r.* value if one exists, or the m.* value if no r records exist. does this make sense? this way your result set will contain the latest record no matter which table (m or r) contains them.

http://www.php.net/manual/en/pdo.constants.php

dqhendricks
  • 19,030
  • 11
  • 50
  • 83
2

I am afraid that you wont be able to solve this problem with a single query. Either you have to use more queries and gather the informations in the surrounding code or you will have to redesign the database structure for your messaging system a litte (tables: threads, posts, etc.). If you decide to redesign the database structure, you should also take care of the way you handle the viewed and archived fields. The way you use the fields (varchar 255 only!) might work for some users, but as soon as there are more users and higher user IDs your message system will break down.

Marco
  • 960
  • 2
  • 7
  • 26
  • Thanks for the response. This is for a closed system with a limited number of users so the 255 limit will not be an issue. – RANGER Jun 29 '11 at 18:12