1

I have a database for a chat application.

CREATE TABLE Users (uid int PRIMARY KEY, name text, phone text );

CREATE TABLE Messages (recipient int REFERENCES Users(uid), sender int 
REFERENCES Users(uid), time timestamp NOT NULL, message text NOT NULL, 
PRIMARY KEY (recipient, sender, time));

http://www.sqlfiddle.com/#!9/bd36d1

I want to define, for each of the 5 users which have sent the most messages, the average length of messages that have been sent by this user.

I have written the following query:

SELECT avg(strlen(message))
FROM Messages
WHERE sender IN
                (SELECT *
                 FROM (SELECT sender, COUNT(sender) AS NumberOfMessages
                       FROM Messages
                       GROUP BY sender) AS MessagesPerSender
                 ORDER BY NumberOfMessages DESC
                 LIMIT 5)

To start with, is this query correct? Does it give me the desired result? The problem is I can't run it at all cause I get the error:

"This version of MySQL doesn't yet support 'LIMIT & IN/ALL/ANY/SOME subquery"

TheWildHealer
  • 1,546
  • 1
  • 15
  • 26
MJ13
  • 195
  • 6

3 Answers3

2

Not the right approach for mysql this may do

select sender,avg(length(message)),count(*)
from   messages
group by sender
order by avg(length(message)) desc limit 5;

+--------+----------------------+----------+
| sender | avg(length(message)) | count(*) |
+--------+----------------------+----------+
|      1 |               9.0000 |        1 |
|      9 |               5.5000 |        2 |
|      2 |               5.0000 |        1 |
+--------+----------------------+----------+
3 rows in set (0.00 sec)

Note this may not deal with draws in the way you want.

P.Salmon
  • 17,104
  • 2
  • 12
  • 19
2

You had 2 errors in your code:

  1. first of all you cannot use strlen in MYSQL. That is an Microsoft SQL Server dialect Instead you need to use length.

  2. Secondly, in the subquery you used, you were using two columns instead on one. This will cause the query to fail because the equals operator needs to be equal to the value in only one column.

So here is your query:

select u.name, avg(length(m.message)), count(*)
from Messages m
inner join Users u on m.sender = u.uid
group by u.name
order by avg(length(m.message)) desc limit 5;

I improved on P. Salmon's answer since I provided you with the name of the sender rather than their ID.

Hope this helps :)

Michele La Ferla
  • 6,775
  • 11
  • 53
  • 79
  • Thanks for the answer!I would like to ask something more if u are willing to help. If I wanted to define all the users who send messages and have a phone starting with "83" why does the following quote returns me 0? Do I use LIKE wrong? sqlfiddle.com/#!17/7303a5/1 – MJ13 May 03 '19 at 11:49
  • 1
    When using `LIKE`, you need to use the percentage sign for it to work well. Your query would be something like this: `select u.name, count(m.sender) from Users u inner join Messages m on m.sender = u.uid where u.phone LIKE '83%' group by u.name having count(m.sender) > 1;` I included the `group by` and `having` clauses since you asked for those users who at least sent one message. – Michele La Ferla May 03 '19 at 12:07
  • One last question that confused me. Let's take all the users that have received (one or many) messages from the user 'peter'. I want to define the number of users that have sent (one or many) messages to ALL the users that have received a message from 'peter' . On my query I define the number of users that have sent messages to at least one of the users that have received a message from 'peter'. How can I modify it to make it give me the desired result? Thanks in advance! http://www.sqlfiddle.com/#!17/a63bc/2 – MJ13 May 03 '19 at 12:33
  • This was tricky :) The query won't return any results because in reality no user who has received a message from Peter, has in turn messaged ALL the other users who received a message from Peter. However if you need the query, here it is: http://sqlfiddle.com/#!9/a63bc7/1/0 – Michele La Ferla May 03 '19 at 13:18
  • 1
    Thanks a lot!I am just not used in the newer explicit join style. I tend to use the join syntax with the where clause. So I'll have to take my time to check it. Seems to work fine by the way! – MJ13 May 03 '19 at 13:35
  • If u have the time please check my previous question about the same database and a different query : https://stackoverflow.com/questions/55950905/confused-with-an-sql-query (you have already helped me a lot) From the answers I have got only one seems to give the desired result. However, I am not familiar with expressions like "least" and "greatest" so I am looking for an answer that is close to the top voted one. However, this one doesn't work right for the reason I explain in the comment section. – MJ13 May 04 '19 at 15:09
1

To find out, I have changed the DMBS from MySQL to Postgres, which supports inner limit. Your query has correct syntax, except the strlen() function, the correct one is length().

However, your query fails for a simple reason: you are doing a where sender in (subquery), although your subquery returns two fields. The in operator only works with single field queries. Moreover, your subquery is composed of two queries, which can be simplified to one. The following query works on Postgres 9.6, and should work on whatever version of MySQL with inner limit support:

SELECT avg(length(message))
FROM Messages
WHERE sender IN (
    SELECT sender
    FROM Messages
    GROUP BY sender
    ORDER BY COUNT(sender) DESC
    LIMIT 5
)

It produces the following result when run on your sample data:

+----------+
|   avg    |
+----------+
|   6.25   |
+----------+

Working SQL Fiddle (Postgres 9.6): http://www.sqlfiddle.com/#!17/bd36d/6/0

TheWildHealer
  • 1,546
  • 1
  • 15
  • 26
  • Wasn't so sure either, but turns out you can, both on Postgres and MySQL. Please mark my answer as accepted if it answers the question. – TheWildHealer May 03 '19 at 10:00
  • The answer is helpful. The only problem is that I think it doesn't return the desired result cause what I want is the average length of messages for the top 5 users (5 results). This returns the average length of messages from the average length of messages for each one of the top 5 users (1 result). – MJ13 May 03 '19 at 10:15
  • Okay, did not understand it like that. Then P.Salmon answer's is the one you are looking for. I have suggested an edit to your question to better reflect what you want. – TheWildHealer May 03 '19 at 10:27
  • 1
    Yeah the query is indeed ambiguous. I am not even 100% sure if this is exactly what the query asks. Just let the answer as it is cause it helped me understand better the problem and there's a possibility it answers my question as well. Thanks for helping me. – MJ13 May 03 '19 at 10:32
  • I would like to ask something more if u are willing to answer. If I wanted to define all the users who send messages and have a phone starting with "83" why does this quote returns me 0? Do I use LIKE wrong? http://www.sqlfiddle.com/#!17/7303a5/1 – MJ13 May 03 '19 at 11:17
  • cause you have spaces between your underscores – TheWildHealer May 03 '19 at 12:01