Yes, it will be just as fast.
(In addition to the excellent points made Mike's answer...) there's an important point we should make regarding indexes (B-tree indexes in particular):
The entries in the index are stored "in order".
The index is also organized in a way that allows the database to very quickly identify the blocks in the index that contain the entries it's looking for (or the block that would contain entries, if no matching entries are there.)
What this means is that the database doesn't need to look at every entry in the index. Given a predicate like the one in your question:
WHERE name = 'John'
with an index with a leading column of name
, the database can eliminate vast swaths of blocks that don't need to be checked.
Blocks near the beginning of the index contain entries 'Adrian'
thru 'Anna'
, a little later in the index, a block contains entries for Caleb
thru Carl
, further long in the index James
thru Jane
, etc.
Because of the way the index is organized, the database effectively "knows" that the entries we're looking for cannot be in any of those blocks (because the index is in order, there's no way the value John
could appear in those blocks we mentioned). So none of those blocks needs to be checked. (The database figures out in just a very small number of operations, that 98% of the blocks in the index can be eliminated from consideration.
High cardinality = good performance
The take away from this is that indexes are most effective on columns that have high cardinality. That is, there are a large number of distinct values in the column, and those values are unique or nearly unique.
This should clear up the answer to the question you were asking. You can add brazilians of rows to the table. If only five of those rows have a value of
John
in the name column, when you do a query
WHERE name = `John`
it will be just as fast. The database will be able to locate the entries your looking for nearly as fast as it can when you had a thousand rows in the table.
(As the index grows larger, it does add "levels" to the index, to traverse down to the leaf nodes... so, it gets ever so slightly slower because of a tiny few more operations. Where performance really starts to bog down is when the InnoDB buffer cache is too small, and we have to wait for the (glacially slow in comparison) disk io operations to fetch blocks into memory.
Low cardinality = poor performance
Indexes on columns with low cardinality are much less effective. For example, a column that has two possible values, with an even distribution of values across the rows in the table (about half of the rows have one value, and the other half have the other value.) In this case, the database can't eliminate 98% of the blocks, or 90% of the blocks. The database has to slog through half the blocks in the index, and then (usually) perform a lookup to the pages in the underlying table to get the other values for the row.
But with gazillions of rows with a column gender
, with two values 'M'
and 'F'
, an index with gender
as a leading column will not be effective in satisfying a query
WHERE gender = 'M'
... because we're effectively telling the database to retrieve half the rows in the table, and it's likely those rows are going to be evenly distributed in the table. So nearly every page in the table is going to contain at least one row we need, the database is going to opt to do a full table scan (to look at every row in every block in the table) to locate the rows, rather than using an index.
So, in terms of performance for looking up rows in the table using an index... the size of the table isn't really an issue. The real issue is the cardinality of the values in the index, and how many distinct values we're looking for, and how many rows need to be returned.