3

It is a classical question and I know there are many work around like here: Select a Column in SQL not in Group By but they do not work for my issue on Bigquery.

I have a table with tweets from Twitter and I want a ranking for the urls including any tweet text.

ID         tweet               url
1          my github tweet     http://www.github.com/xyz
2          RT github tweet     http://www.github.com/xyz
3          another tweet       http://www.twitter.com
4          more tweeting       http://www.github.com/abc

I tried the following query, but then id 1 and 2 are counted separately.

SELECT tweet, count(url) as popularity, url FROM table group by tweet, url order by popularity desc

How can I count/rank the urls correctly and still preserve any associated tweet text in the result? I do not care if it is from ID 1 or 2.

Community
  • 1
  • 1
crisscross
  • 1,675
  • 2
  • 18
  • 28

2 Answers2

2

Here is one approach:

SELECT url, COUNT(*) AS popularity, GROUP_CONCAT(tweet) 
FROM Table GROUP BY url ORDER BY popularity

GROUP_CONCAT aggregation function will concatenate all the tweets associated with same URL using comma as separator (you can pick another separator as second parameter to GROUP_CONCAT).

Mosha Pasumansky
  • 13,206
  • 5
  • 32
  • 55
  • Works perfect. I added a left() to truncate the output and I am happy with that. – crisscross Apr 06 '15 at 10:51
  • @crisscross in case you only want one tweet, you could also do SELECT url, COUNT(*) AS popularity, MAX(tweet) FROM Table GROUP BY url ORDER BY popularity (GROUP_CONCAT will struggle if there are too many tweets to group by) – Felipe Hoffa Apr 06 '15 at 17:16
0

I'm not sure this will work with google-bigquery or not, I haven't experience with it but this is a solution with pure sql I thought it may works for you.

get the count of urls in a subquery and then join it with the table on url:

select t.id,t.tweet,t.url,q.popularity
from table t
join 
(SELECT url, count(url) as popularity
FROM table group by url) q
on t.url=q.url
order by q.popularity desc
void
  • 7,760
  • 3
  • 25
  • 43
  • Thanks so much for the quick reply. It did not completely worked out like that, but I am glad to see there is an option with a join statement. – crisscross Apr 06 '15 at 10:51