-2

I have two tables named Student and Course and a join table named StudentCourse. Both Student and Course have an auto increment integer Id.

StudentCourse table will have at most 10,000 records (not more!) and I will use this table for select some data in most scenarios. Inserting performance is not my concern now and I just want to get the data fast!

My queries are in form of:

select * from studentcourse where studentid == sid and courseid == cid
select * from studentcourse where studentid == sid

Which one of implementations below is better? Why? Is it always better than the other or it depends? Is there even any better solution for my senario?

1: Using an auto increment Id column for StudentCourse as primary key and set a unique composite index like (StudentId, CourseId).

2: Using a composite primary key like (StudentId, CourseId)

3: Creating table with no primary key and just use the unique composite index like (StudentId, CourseId).


Update

What if we have some other properties that only matters when we have a relation! Consider Score or Grade as a 3rd column in StudentCourse table. Should we use 1st option in this situations? I mean lets assume we are using an ORM and we should have a reference to StudentCourse class (entity) in our Student class (entity).

  • 1
    Like both the answers below, I'd go for #2. Remember an index sorts the data by the appropriate columns, so for the clustered index (which typically by default comes with the primary key) you often want the first column to be one you filter by the most frequently. However, on a table like that, I often *also* have a non-clustered index in the reverse order (e.g., on `CourseId, StudentID`) so if you're doing a query from the other way (e.g., which students are in CourseID 421?) it also has an index it can use. – seanb Oct 28 '20 at 11:57
  • Following your update: The key part here is that the clustered index helps you when you need it. Some databases (e.g., SQL Server) by default make your clustered index the same as the PK. You can have a different clustered index than PK if you want (e.g., option 1) especially if you can have duplicates in what you want as the clustered index. The advantage of having the clustered index on `StudentID, CourseID` is that it helps when looking up course records for a specific student or set of students. Suggest https://www.brentozar.com/training/think-like-sql-server-engine/ - it really helped me. – seanb Oct 28 '20 at 21:36

2 Answers2

3

The first two options are fine. The important part is to have a unique index on (studentid, courseid). The unicity guarantees data integrity (the same student cannot take the same course twice), and the index improves the efficiency of the lookups.

Whether you should have a surrogate primary key or not is opened to discussion. If you need to uniquely identify each record somehow (like: refer to that table from another table for example), a single-column primary key might come handy.

I would not recommend the third option, because it does not enforce the integrity constraint.

Finally: in most databases, the ordering of the keys in the index does matter. For the two queries that you have showed, you want studentid in first position in the index, so the query that filter on this column only can still benefit the index.

GMB
  • 216,147
  • 25
  • 84
  • 135
3

2 is the best.

It's better than 1 because it has one less index (so less space/cache is taken and the index maintenance is less expensive). A surrogate key (like Id) is generally useless in a junction table.

It may be better than 3 depending on your DBMS: some DBMSes will not allow you to cluster the table on a unique index (just on a primary key)1. Even on a DBMS that allows it (like SQL Server), I'd go with 1 for being more direct and having simpler syntax.


1 And some will not allow you to reference a unique index (just PRIMARY KEY or UNIQUE constraint) from a FOREIGN KEY, though that's probably not relevant here. If it was, than you'd consider the surrogate key Id.

Branko Dimitrijevic
  • 50,809
  • 10
  • 93
  • 167