-2

I have a question about SQL indexing in my theory class. It asks me to choose which column should be indexed to optimize these 2 queries: (This is 'paper' question so I am not provided the database to test these indexes with EXPLAIN)

1, First query.

SELECT BranchName, No_Of_Copies 
FROM BOOK as B, BOOK_COPIES as BC, LIBRARY_BRANCH as LB
WHERE B.BookId = BC.BookId and BC.BranchId=LB.BranchId and title ="The Lost Tribe";

I have the answer of this one, which is that BOOK.title, BOOK_COPIES.BranchId and LIBRARY_BRANCH.BranchId should be used for indexing. However, I don't really understand why BOOK.BookId and BOOK_COPIES.BookId are not chosen for indexing.

2, Second query

SELECT B.cardNo, Name, Address, COUNT(BookId,BranchId) 
FROM BORROWER as B, BOOK_LOANS as BL
WHERE (BL.CardNo=B.CardNo)
GROUP BY B.CardNo, Name, Address
HAVING COUNT(BL.BookId, BranchId)>5;

Would it be optimized if I create index on BOOK_LOANS.CardNo, BORROWER.CardNo, Name and Address ?

  • 3
    "This is 'paper' question so I am not provided the database to test these indexes" -- The paper won't stop you from creating a test database by *yourself*... – sticky bit Jun 15 '21 at 11:43
  • Incidentally, since 1992, your first query would normally be written this way: `SELECT lb.branchname, bc.no_of_copies FROM book b JOIN book_copies bc ON bc.bookid = b.bookid JOIN library_branch lb ON lb.branchid = bc.branchid WHERE title = "the lost tribe";` – Strawberry Jun 15 '21 at 12:00
  • And what does `COUNT(BookId,BranchId)` do? – Strawberry Jun 15 '21 at 12:02

1 Answers1

0

That class needs to be updated. Using commas in JOIN is antiquated; the new style uses JOIN .. ON

The question is ambiguous -- what table contains title? I'll assume it is B.

  1. Since the only filtering is on title, The Optimizer will pick B as the first table to look at:

     B needs INDEX(title)
    
  2. From B, it can reach for BC:

     BC needs INDEX(BookId)
    
  3. Similarly:

     LB needs INDEX(BranchId)
    

If you are using MySQL, be aware that a PRIMARY KEY is an index. And every table needs a PRIMARY KEY. Also a PRIMARY KEY is necessarily unique. So, when I said "needs", you may find that there is already a PRIMARY KEY satisfying the need.

More: http://mysql.rjweb.org/doc.php/index_cookbook_mysql

I will quibble with the schema -- why is the "number of books" not simply a column in Books?

As for query 2, it is even less clear which table each column might be in.

Do be aware that an INDEX can contain only columns from a single table.

Rick James
  • 135,179
  • 13
  • 127
  • 222
  • Thank you for your answer ! Yeah I agree with you that these 2 queries are really ambiguous, but those were exactly what I am provided from my teacher. I asked her for the schema of the database or at least some information about the tables, but she said that I would not need that to do the questions. However, your explaination on the first query made it clear to me about the answer. As I understand, you mean that `BOOK.BookId` and `BOOK_COPIES.BookId` are primary keys, so they are already indexed and don't need to be included in the answer. – Nguyễn Hữu Hưng Jun 16 '21 at 09:57