Use the source, Mike.
The resolver uses a linear search through the text file to locate entries; it is, in effect, a database with no indexes. So, in the absence of any additional caching, the cost of a lookup is O(n). As for when that will result in a noticeable degradation in performance, there is no single answer - it simply gets slower with every record you add.
If you ask a database programmer or admin, you'll get different figures for the point at which an index lookup (O(log2 n)) becomes cheaper than a full table scan, but generally the answer will be somewhere in the region of 20 to 100 records.
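To make the O(n)-versus-O(log n) trade-off concrete, here is a minimal Python sketch - not the glibc resolver, and the record counts and hostname scheme are invented for illustration - that times a sequential scan of a hosts-style list against a binary search over a pre-sorted index:

```python
# Compare a linear scan (what a hosts-file lookup effectively does)
# with an indexed (binary-search) lookup over the same records.
import bisect
import timeit

def linear_lookup(records, name):
    # O(n): walk every entry until the name matches, like a hosts-file scan
    for host, addr in records:
        if host == name:
            return addr
    return None

def indexed_lookup(sorted_names, addrs, name):
    # O(log n): binary search over a pre-sorted "index" of hostnames
    i = bisect.bisect_left(sorted_names, name)
    if i < len(sorted_names) and sorted_names[i] == name:
        return addrs[i]
    return None

for n in (10, 100, 1000, 10000):
    records = [(f"host{i:05d}.example", f"10.0.{i // 256}.{i % 256}")
               for i in range(n)]
    sorted_names = [h for h, _ in records]   # already sorted by construction
    addrs = [a for _, a in records]
    target = records[-1][0]                  # worst case for the linear scan
    t_lin = timeit.timeit(lambda: linear_lookup(records, target), number=2000)
    t_idx = timeit.timeit(lambda: indexed_lookup(sorted_names, addrs, target), number=2000)
    print(f"n={n:6d}  linear={t_lin:.4f}s  indexed={t_idx:.4f}s")
```

You should see the linear scan's cost grow roughly in proportion to n, while the indexed lookup stays nearly flat.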
Any Linux system that needs to resolve a lot of names (not just hostnames) should be running nscd or a similar cache. Most such caches index the data themselves, which makes the raw lookup-cost question largely moot.
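As a rough illustration of what a cache buys you - this is an in-process stand-in using Python's functools.lru_cache, not nscd itself, and real caches also handle TTLs and negative results:

```python
# Repeated lookups of the same name are answered from memory
# instead of going back to the resolver each time.
import socket
from functools import lru_cache

@lru_cache(maxsize=1024)
def cached_resolve(name: str, port: int = 0):
    # First call pays the full resolver cost (hosts file, DNS, ...);
    # subsequent calls for the same name are cheap hash lookups.
    return tuple(socket.getaddrinfo(name, port, proto=socket.IPPROTO_TCP))

print(cached_resolve("localhost"))   # resolved via NSS on the first call
print(cached_resolve("localhost"))   # served from the in-process cache
print(cached_resolve.cache_info())   # hits=1, misses=1 if run as above
```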
However, caching doesn't fix everything: the hosts file provides no means of managing complex or large datasets - if a host has more than one IP address, lookups via the hosts file will always return the first entry.
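A toy illustration of that first-match behaviour, using an invented hosts-file snippet rather than a real /etc/hosts:

```python
# When the same hostname appears on several lines, a sequential scan
# stops at the first hit, so the later addresses are never returned.
SAMPLE_HOSTS = """\
127.0.0.1   localhost
10.0.0.10   app.internal
10.0.0.11   app.internal
"""

def first_match(hosts_text: str, name: str):
    for line in hosts_text.splitlines():
        line = line.split("#", 1)[0].strip()   # drop comments and blanks
        if not line:
            continue
        addr, *names = line.split()
        if name in names:
            return addr                        # first entry wins; 10.0.0.11 is ignored
    return None

print(first_match(SAMPLE_HOSTS, "app.internal"))  # -> 10.0.0.10
```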