A billion rows, each with a 100 KB field, is, to say the least, big. That comes to 100 TB of data (using the decimal definition of "terabyte"). According to the documentation:
The InnoDB storage engine maintains InnoDB tables within a tablespace
that can be created from several files. This enables a table to exceed
the maximum individual file size. The tablespace can include raw disk
partitions, which permits extremely large tables. The maximum
tablespace size is 64TB.
In other words, you may have bigger problems than performance: 100 TB is well past the 64 TB tablespace limit, so you will probably be spreading the table across multiple partitions, each with its own tablespace.
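As a minimal sketch of RANGE partitioning (table and column names here are hypothetical, not from the question), note that MySQL requires the partitioning column to appear in every unique key, including the primary key:

CREATE TABLE t (
    id INT UNSIGNED NOT NULL AUTO_INCREMENT,
    userid INT UNSIGNED NOT NULL,
    createddate DATETIME NOT NULL,
    PRIMARY KEY (id, createddate)   -- the partitioning column must be part of the PK
)
PARTITION BY RANGE (YEAR(createddate)) (
    PARTITION p2023 VALUES LESS THAN (2024),
    PARTITION p2024 VALUES LESS THAN (2025),
    PARTITION pmax  VALUES LESS THAN MAXVALUE
);

With innodb_file_per_table enabled, each partition is stored in its own tablespace file, so the 64 TB cap applies per partition rather than to the table as a whole.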
If you are only occasionally retrieving the text and never using it for searches, I would suggest that you store it in a separate table. That way, you can tune that table for how the text is actually accessed. You'll have a primary key used for reference, and all references will go through that id.
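A minimal sketch of that layout (hypothetical names again; note that a 100 KB value does not fit in a TEXT column, which caps at 64 KB, so the column needs MEDIUMTEXT or larger):

CREATE TABLE t (
    id INT UNSIGNED AUTO_INCREMENT PRIMARY KEY,
    userid INT UNSIGNED NOT NULL,
    createddate DATETIME NOT NULL
    -- other "fixed" columns
);

CREATE TABLE t_text (
    id INT UNSIGNED PRIMARY KEY,        -- same id as t: one text row per base row
    body MEDIUMTEXT NOT NULL,           -- TEXT maxes out at 64 KB; MEDIUMTEXT allows 16 MB
    FOREIGN KEY (id) REFERENCES t (id)
);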
If you are using the text for searches, particularly searches combined with the "fixed" data, then my architectural preference would be to include it in the base table to facilitate the searching across fields.
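One way to support such combined searches, sketched here as an option rather than a requirement, is an InnoDB FULLTEXT index on the text column (assumed here to live in the base table as a hypothetical column called body):

ALTER TABLE t ADD FULLTEXT INDEX ft_body (body);

-- Match the text while filtering on a fixed column in the same query
SELECT id, userid, createddate
FROM t
WHERE MATCH(body) AGAINST ('some phrase' IN NATURAL LANGUAGE MODE)
  AND createddate >= '2024-01-01';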
However, even with this preference, it is probably safer to put the text in a different table. For instance, MySQL materializes subqueries (derived tables), and it is very typical to use * in a subquery. Consider a simple case: a query to get the 1000 most recent records, ordered by userid:
select t.*
from (select t.*
      from t
      order by createddate desc   -- desc, so the most recent rows come first
      limit 1000
     ) t
order by userid
The use of t.* means that the text column would also be retrieved. So a query that might take a fraction of a second (with an index on createddate) would instead have to read and write 1000 * 100 KB = 100 MB of data (at least), because the derived table is materialized with the text column in it. This would probably take a bit longer.
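A common workaround, sketched here with the hypothetical names from above, is to select only the id inside the subquery and join back to the base table for the remaining columns:

select t.id, t.userid, t.createddate
from (select id
      from t
      order by createddate desc
      limit 1000
     ) recent
join t on t.id = recent.id
order by t.userid;

Now the materialized derived table holds only 1000 ids, a few kilobytes, and the text column is never touched unless you explicitly join it in.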
In conclusion, I would advocate putting the text column in the base table when it is often searched together with other columns -- for example, in a database of abstracts of scientific papers. For really large data, I would put it in a separate table, so I could better manage the storage in extreme cases.