4

I have a table of books, a table of authors, and a "linker" table (many to many links between authors/books).

How do I find the authors with the highest number of books?

This is my schema:

books : rowid, name

authors : rowid, name

book_authors : rowid, book_id, author_id

This is what I came up with: (but it doesn't work)

SELECT count(*) IN book_authors 
WHERE (SELECT count(*) IN book_authors 
       WHERE author_id = author_id)

And ideally I would like a report of the top 100 authors, something like:

 author_name      book_count
 -----------------------------------
 Johnny           25
 Kelly            12
 Ramboz           10

Do I need some kind of join? What is the fastest approach?

Community
  • 1
  • 1
Robin Rodricks
  • 110,798
  • 141
  • 398
  • 607
  • 1
    Couldn't you just `select count(1) , Author_ID from Book_Authors group by Author_ID order by count(1) desc limit 100` the authors with the most books would be at the top (or the author_ID would be at least) As for limiting to top 100... then add limit clause http://stackoverflow.com/questions/3325515/sqlite-limit-offset-query – xQbert Jul 07 '16 at 16:49
  • 1
    I think @Mureinik also has a valid answer matching your desired results. Just as valid and his posted sooner. Plus it joins to the other tables to return your desired results and allows you to add other columns from those tables if you desire. It will be a little slower as it has to process the joins but it will return what you asked for. – xQbert Jul 07 '16 at 16:56
  • The only reason it would be noticeably slower is if you don't have indexes on rowID in book/author table and Author_ID, Book_ID a in book_authors. or you really have millions of records in the tables. – xQbert Jul 07 '16 at 17:04

3 Answers3

5

I'd join the three tables (via the book_authors table), group by the author, count occurrences and limit it to the top 100 rows:

SELECT   a.name, COUNT(*)
FROM     authors a
JOIN     books_authors ba ON a.rowid = ba.author_id
JOIN     books b ON ba.book_id = b.rowid
GROUP BY a.name
ORDER BY 2 DESC
LIMIT    100

EDIT:
Actually, we aren't using any data from books, just the fact the book actually exists, which can be inferred from books_authors, so this query can be improved by dropping the second join:

SELECT   a.name, COUNT(*)
FROM     authors a
JOIN     books_authors ba ON a.rowid = ba.author_id
GROUP BY a.name
ORDER BY 2 DESC
LIMIT    100
Mureinik
  • 297,002
  • 52
  • 306
  • 350
1

Couldn't you just

select count(1) , Author_ID from Book_Authors group by Author_ID order by count(1) desc limit 100

The authors with the most books would be at the top (or the author_ID would be at least)

As for limiting to top 100... then add limit clause Sqlite LIMIT / OFFSET query

Community
  • 1
  • 1
xQbert
  • 34,733
  • 2
  • 41
  • 62
0

SELECT TOP 3 authors.author_name, authors.book_name, books.sold_copies, (SELECT SUM(books.sold_copies) FROM books WHERE authors.book_name = books.book_name ) AS Total FROM authors INNER JOIN books ON authors.book_name = books.book_name ORDER BY sold_copies desc

  • 1
    Your answer could be improved with additional supporting information. Please [edit] to add further details, such as citations or documentation, so that others can confirm that your answer is correct. You can find more information on how to write good answers [in the help center](/help/how-to-answer). – Community Jul 12 '22 at 06:45