0

I am working on a forum system (mysql) and I'm not sure which path to choose for better performance when retrieving in a single query posts, up and down votes and if the current user voted for each post. The first option is this:

SELECT posts.post_id, post_content, display_name, 
(SELECT COUNT(post_id) FROM post_votes WHERE post_votes.post_id=posts.post_id AND post_votes.user_id='+user_id+') voted,
(SELECT COUNT(post_id) FROM post_votes WHERE post_votes.post_id=posts.post_id AND up_vote=1) upvotes, 
(SELECT COUNT(post_id) FROM post_votes WHERE post_votes.post_id=posts.post_id AND up_vote=0) downvotes
FROM posts JOIN users ON users.user_id=posts.user_id WHERE parent_id ='+parent_id+' ORDER BY post_id DESC

The second option is to replace all the count sub-queries with LEFT JOIN and count. Are there any advantages to one method over the other?

Edit:

Since I'm looking to retrieve all posts rather than a single row that groups posts, I came up with this query (with some inspiration from here):

SELECT p.post_id, post_content, display_name,
COALESCE(v.upvotes, 0) AS upvotes, 
COALESCE(v.downvotes, 0) AS downvotes, 
COALESCE(v.voted, 0) AS voted 
FROM posts p 
LEFT JOIN ( 
        SELECT post_id, 
        SUM(vt.up_vote = 1) AS upvotes, 
        SUM(vt.up_vote = 0) AS downvotes, 
        MAX(IF(vt.user_id = ' + user_id + ', vt.up_vote, NULL)) voted 
        FROM post_votes vt 
        GROUP BY vt.post_id
        ) 
        v ON v.post_id = p.post_id 
JOIN users ON users.user_id=p.user_id 
WHERE parent_id =' + parent_id + ' ORDER BY post_id DESC

I have ran both solutions on my demo db (tiny at the moment, contains less than 100 rows in each table) and the durations were identical. The question is which one will be faster for the long term.

Community
  • 1
  • 1
Omer Greenwald
  • 758
  • 7
  • 20

1 Answers1

1

I can hardly think of anything where a subquery was faster than a join.

In this case you don't even need a join. Do it all in one query:

SELECT 
p.post_id, 
p.post_content, 
u.display_name, 
COUNT(pv.post_id) AS voted,
SUM(pv.up_vote = 1) AS upvotes,
SUM(pv.up_vote = 0) downvotes
FROM posts p
JOIN users u ON u.user_id = p.user_id 
LEFT JOIN post_votes pv ON posts.post_id = pv.post_id AND pv.user_id ='whatever'
WHERE p.parent_id ='+parent_id+' 
GROUP BY p.post_id
ORDER BY p.post_id DESC

The pv.up_vote = 'whatever' inside the SUM() function returns either true or false, 1 or 0. That's why we use the SUM() function here. And voila, everything in one query.

fancyPants
  • 50,732
  • 33
  • 89
  • 96
  • Thanks. please explain how does the count(*) calculates if the user voted or not without adding a subquery? by the way, the down votes should be calculated separately and not be subtracted from the upvotes. – Omer Greenwald Jan 29 '15 at 19:43
  • `COUNT(*)` is just the total number of rows. If you want to count only the rows in post_vote table specify a column name from that table. Fixed that in my answer. Your second comment I don't understand. I'm not subtracting anything in my answer. – fancyPants Jan 29 '15 at 22:43
  • ok, I'll try it. however I'm still not sure how the counter relates specifically to the logged in user and not to all the votes. second issue is clear now – Omer Greenwald Jan 30 '15 at 08:22
  • Well, I was assuming, that every row is a vote. You give no information whatsoever about your data, all I see is a query. Think about it. It's always a good idea to include sample data and desired result in a question and a small explanation about the data. – fancyPants Jan 30 '15 at 08:26
  • After checking your solution and finding the output is a single row, I realize I probably didn't explain the requirements well enough. Your solution works well for counting votes of posts under a specific parent id, but what I need is actually multiple rows, one for each of the posts. – Omer Greenwald Feb 04 '15 at 18:38
  • Your suggestion seem to work. However, I noticed that you feel a strong urge to condescend on others when you think they were mistaken. You did it with @Rahul in the question comments and here you do it again, more subtle but still uncalled for. I understand it makes you feel (mistakenly) smarter than others, but if you make it a habit to be more humble and respectful to others, it might actually improve your life beyond stackoverflow. – Omer Greenwald Feb 05 '15 at 07:05
  • I'm very sorry you took personally the fact I unaccepted your answer, I simply wanted to get more answers to my question, one of which, "god forbid", might be better than yours. If you weren't self centered, you might have noticed I upvoted your answer. Next time if you think a question is written poorly just don't answer and save everyone your arrogant comments such as "you question is poor" or "you have no clue on the subject". And don't call ME rude after such comments. – Omer Greenwald Feb 05 '15 at 09:03
  • You just don't get it, do you? Improve your question! Make it clear, what you're after. Sample data, desired result, include it all. You said that the addition of `GROUP BY` seems to work. Does it work or not? If not, why? If you had added this information, that you need this at all in the first place, we could have spared ourselfes this discussion. So please, spare me everything else. I'm deleting my comments, feel free to do the same. – fancyPants Feb 05 '15 at 09:42