I look after a large site and have been studying similar sites, in particular Flickr and DeviantArt. I have noticed that although they claim to hold a huge amount of data, they only let you page through a limited portion of it.
I presume this is for performance reasons, but does anyone have an idea how they decide what to show and what not to show? Classic example: go to Flickr and search for a tag. Note the number of results stated just under the page links. Now calculate which page the last result would be on and go to that page. You will find there is no data on that page. In fact, in my test, Flickr said there were 5,500,000 results but only displayed 4,000. What is this all about?
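To make the page calculation concrete, here is a rough sketch of the arithmetic I mean. The page size of 25 and the helper name `last_page` are assumptions for illustration only; I do not know the real page size Flickr uses.

```python
def last_page(total_results: int, per_page: int) -> int:
    """Return the 1-indexed page on which the final result would appear."""
    # Integer ceiling division, so a partially filled last page still counts.
    return -(-total_results // per_page)


PER_PAGE = 25  # assumed page size, not a value confirmed by Flickr

# Page implied by the result count the site reports.
reported_last = last_page(5_500_000, PER_PAGE)

# Last page that could hold data if only 4,000 results are actually served.
reachable_last = last_page(4_000, PER_PAGE)

print(reported_last, reachable_last)
```

With that assumed page size, any page past `reachable_last` comes back empty, even though the reported total implies there should be far more pages.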
Do larger sites get so big that they have to start taking old data offline? DeviantArt has a wayback function, but I am not quite sure what that does.
Any input would be great!