3

I am trying to retrieve data from three tables at once. The tables look like:

categories

id
category
messageid

messages

id
title
message

comments

id
messageid
message

What I am trying to get is 1 message (because I have a WHERE clause based on id 3), 3 categories (because there are 3 categories linked to the message) and 2 comments (because there are 2 comments linked to the message).

I'm trying to retrieve the data by usnig the following query:

SELECT categories.category, messages.id, messages.title, messages.message, comments.count, comments.id as commentid, comments.message as commentmessage
FROM categories
RIGHT JOIN 
  (SELECT id, title, message
  FROM messages WHERE messaged.id = 3) messages
ON messages.id = categories.messageid
LEFT JOIN
  (SELECT count(id), id, message
  FROM comments
  GROUP BY id, message) comments
ON messages.id = comments.messageid
ORDER BY article.id DESC

However when running this query I get 6 results:

category      id  title      message      count  commentid  commentmessage
test          3   the title  the message  1      6          comment 1
test          3   the title  the message  1      5          comment 2
installation  3   the title  the message  1      6          comment 1
installation  3   the title  the message  1      5          comment 2
question      3   the title  the message  1      6          comment 1
question      3   the title  the message  1      5          comment 2

Where I expected a result something like:

category      id  title      message      count  commentid  commentmessage
test          3   the title  the message  1      6          comment 1
question      3   the title  the message  1      5          comment 2
installation  3   the title  the message  1      null       null

With only three rows I should be able to get all the needed data. Is this even possible? Am I doing it completely wrong ™?

user6669
  • 155
  • 1
  • 2
  • 10
  • What database? Also, does the posted query actually run? Your group by doesn't include message; I would normally expect that to fail. – Bert Jul 25 '12 at 20:48
  • As far as I can see, one category got associated 1 message, and this message got associated 2 comments. So, if you got 3 categories, you will got 6 comments, two for categories (two for message). If you want something like your expected output, you should give us more information, and decide which commentMesagge choose. – Gonzalo.- Jul 25 '12 at 20:52
  • @BertEvans Yup. That would fail. Sorry copy/paste error. Fixed now. Database is Postgresql. – user6669 Jul 25 '12 at 20:55
  • @ElVieejo What information do you need more? What do you mean by "and decide which commentMesagge choose"? – user6669 Jul 25 '12 at 20:57
  • 1
    You're doing it wrong. You can't retrieve results from two different tables at once row-by-row in order. When you join tables, all possible combinations that match are always included. What you really need to do is pull the categories separately from pulling the messages with two individual queries. – mellamokb Jul 25 '12 at 20:58
  • you're getting the first result. Now, you said that you want the second result. In test category, for example, at the end shows the comment1, and not comment 2 (in the first result, you can see that both comments are related to this category). So, how do you decide that you want comment1 ? I mean, what logic do you use ? – Gonzalo.- Jul 25 '12 at 20:59
  • 1
    As @mellamokb said, you wouldn't typically retrieve the data in this fashion. With one to many relationships you would get the parent data (the message) and then retrieve the categories and comments separately. That said, it could be done; you are just duplicating data. Is it a requirement to retrieve all the information in as few rows as possible? – Bert Jul 25 '12 at 21:05
  • @BertEvans Nope. It isn't a requirement at all. I just thought it would be possible. But given your guys comments I just found out that: I'm doing it wrong ™ :-) Feel free to answer stating this if you want some easy rep. – user6669 Jul 25 '12 at 21:13
  • @user6669 Also, which database? – Bert Jul 25 '12 at 21:20
  • @user6669 OK, well the trick I show in the answer for collapsing categories is not likely to work in postgresql, but the rest should apply. – Bert Jul 25 '12 at 21:59

1 Answers1

8

There are a couple issues here as mentioned in the comments.

First, since you are joining three tables, the answer you are getting is correct. 1 x 2 x 3 rows = 6.

Second, your aggregate of the comments is not really aggregating anything. As you can see in your results, count is always one, when I expect you thought it would be 2 for two comments. Since you are grouping on the id, the count is performed for each unique id, which will always be one. I think you probably want to group on messageid

SELECT count(*), messageid
FROM comments
GROUP BY messageid

You would need to do another join or a separate query to get the comments themselves.

Also as discussed in the comments, you wouldn't typically get the information this way; you would usually just make the three queries since two of the relationships are one to many. You could, if your categories were short (and you're using SQL Server) compress the categories into their own column (ie. 'test, installation, question'). Here is how you would do that.

select id, title, message,
       (select CAST(category + ', ' as nvarchar(max))
        from @Categories c where messageid = m.id
        for xml path('')) as Categories
from @Messages m
where m.id = 3

Actually, there are several methods for doing that, but that's quick and dirty. Then you would only need one additional query for the comments. You could just join to the previous query and get all your information in two rows like this

select m.id, title, m.message,
       (select CAST(category + ', ' as nvarchar(max))
        from @Categories c where messageid = m.id
        for xml path('')) as Categories,
        cm.message
from @Messages m
left outer join @Comments cm on m.id = cm.messageid
where m.id = 3

but again, you'd probably just want to make an additional query to avoid duplicating information.

Finally, I wanted to show how you would probably want to do the comment count.

select m.id, title, m.message,
       (select CAST(category + ', ' as nvarchar(max))
        from @Categories c where messageid = m.id
        for xml path('')) as Categories,
        CommentCount,
        cm.message
from @Messages m
left outer join 
(   
    select messageid, COUNT(*) CommentCount
    from @Comments 
    group by messageid
) rsCommentCount on rsCommentCount.messageid = m.id

And lastly, here is a link showing that working.

Community
  • 1
  • 1
Bert
  • 80,741
  • 17
  • 199
  • 164
  • 1
    +1 Excellent answer. I might suggest if you want to demo SQL queries, try http://sqlfiddle.com, which is made for that purpose and actually supports Postgre. SO queries may or may not stay around forever. – mellamokb Jul 26 '12 at 12:59