7

Does anyone have any idea why questions posted here on SO show up so quickly on Google?

Sometimes newly submitted questions appear among the first 10 or so results, on the first page, within 30 minutes of being submitted. Pray tell, what sort of magic is being wielded here?

Does anybody have any ideas or suggestions? My first thought is that they have info in their sitemap that tells Google's robots to crawl every N minutes or so. Is that what's going on?

BTW, I am aware that simply instructing Googlebot to scan your site every N minutes will not work if you don't have quality information that is constantly being updated on your site.
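For context, the kind of sitemap hint described above might look roughly like the sketch below. It builds a sitemap in the sitemaps.org format with `lastmod` and `changefreq` elements. The URL, the date, and the `build_sitemap` helper are made up for illustration, and `changefreq` is only a hint that crawlers are free to ignore; nothing here is known to be what SO actually publishes.

```python
import xml.etree.ElementTree as ET
from datetime import datetime, timezone

# Namespace defined by the sitemaps.org protocol.
SITEMAP_NS = "http://www.sitemaps.org/schemas/sitemap/0.9"


def build_sitemap(entries):
    """Build sitemap XML from (url, last_modified, changefreq) tuples.

    Hypothetical helper for illustration only.
    """
    urlset = ET.Element("urlset", xmlns=SITEMAP_NS)
    for loc, lastmod, changefreq in entries:
        url = ET.SubElement(urlset, "url")
        ET.SubElement(url, "loc").text = loc
        # W3C date format (YYYY-MM-DD) is accepted for <lastmod>.
        ET.SubElement(url, "lastmod").text = lastmod.strftime("%Y-%m-%d")
        # A hint about update frequency, not a command to the crawler.
        ET.SubElement(url, "changefreq").text = changefreq
    return ET.tostring(urlset, encoding="unicode")


# Made-up example entry.
sitemap_xml = build_sitemap([
    (
        "https://example.com/questions/1",
        datetime(2010, 5, 27, tzinfo=timezone.utc),
        "hourly",
    ),
])
print(sitemap_xml)
```

Valid `changefreq` values in the protocol are `always`, `hourly`, `daily`, `weekly`, `monthly`, `yearly`, and `never`; there is no way to request a crawl "every N minutes".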

I'd just like to know if there is something else that SO may be doing right (apart from the marvelous content, of course).

Cœur
morpheous
  • Well, some questions seem to be there moments after they are submitted. This is only partly SEO. Google will pull questions from SO at short intervals and include them immediately in its index; plus Google loves SO (not just for being an official Android forum). – miku May 27 '10 at 12:29
  • Is some bot submitting the link to Google? – Srinivas Reddy Thatiparthy May 27 '10 at 12:30
  • 2
    I'm not going to submit an answer because I honestly don't _know_ but I used to work at a pretty large news web site. Our traffic was monster and Alexa score was high, but our SEO was relatively poor and link-backs few. Our stats showed that most users came to us via a bookmark or typing our URL! Yet we were abnormally high in all Google searches. It wasn't spidering that did that. I think either there is a Yahoo-style team massaging results for Google and/or the Google Toolbar is helping monitor people's surfing habits. My gut is that Google deemed us popular and, thus, we are. – Andrew May 27 '10 at 13:06
  • @Andrew: would you care to explain what you mean by "there is a Yahoo-style team massaging results for Google"? What is a "Yahoo-style team"? I mean, what would such a team be doing? (Presumably Yahoo is doing something us mere mortals are not aware of?) – morpheous May 27 '10 at 13:47
  • I guess that is a little dated. Back in the very early days, Yahoo was a category-based search engine. When you submitted a link, a human actually reviewed it to see that it fit into the categories you specified -- and you often got rejected or reassigned to another category. I don't think anybody does that any more as a primary source of information. Spidering and mining algorithms allow you to go much, much wider with virtually no added effort (but with potentially much less accuracy). But for making sure that high-traffic search areas remain relevant, humans might do better. Just a hunch. – Andrew May 27 '10 at 23:50

4 Answers

7

To put it simply, more popular websites with more quality content and more frequent changes are ranked higher with Google's algorithm, and are indexed and cached more frequently than sites that are less popular or change less frequently.

Delan Azabani
  • 1
    Good answer, but I think you mean crawled and indexed, not indexed and cached. – Stephen May 27 '10 at 12:46
  • 1
    Nope he's right. Check some SERPs. You'll notice a page may be indexed, but often times not cached until Google determines it should be. – hsatterwhite May 27 '10 at 15:32
5

Broadly speaking, it's only content that does it. The size and quality of the content have reached Google's threshold for "spider as fast as the site will permit". SO has to actively throttle the Googlebot; Jeff has said on Coding Horror that they were getting more than 50,000 requests per day from Google, and that was over a year ago.
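How SO throttles the Googlebot isn't described here, but one common way a site could do it is a token bucket keyed on the crawler's user agent. The sketch below is only an illustration under that assumption: `TokenBucket`, `should_serve`, the rate, and the capacity are all made-up names and numbers, and matching on the user-agent string alone is naive because it can be spoofed.

```python
import time


class TokenBucket:
    """Allow at most `capacity` requests in a burst, refilled at `rate_per_sec`."""

    def __init__(self, rate_per_sec, capacity, clock=time.monotonic):
        self.rate = rate_per_sec
        self.capacity = capacity
        self.tokens = float(capacity)  # start with a full bucket
        self.clock = clock
        self.last = clock()

    def allow(self):
        # Refill in proportion to the time elapsed, capped at capacity.
        now = self.clock()
        self.tokens = min(self.capacity, self.tokens + (now - self.last) * self.rate)
        self.last = now
        if self.tokens >= 1:
            self.tokens -= 1
            return True
        return False


# Hypothetical limits: 1 request per second, bursts of up to 5.
crawler_bucket = TokenBucket(rate_per_sec=1.0, capacity=5)


def should_serve(user_agent, bucket=crawler_bucket):
    """Return False when a crawler request should be turned away for now."""
    if "Googlebot" not in user_agent:
        return True  # ordinary visitors are not throttled in this sketch
    return bucket.allow()
```

A server using something like this would typically answer a rejected crawler request with an HTTP 429 or 503 status so the crawler backs off and retries later.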

If you scan through non-news sites from the Alexa top 500, you will find that virtually all of them have results in Google that are just minutes old (e.g. type site:archive.org into Google and choose "Latest" in the menu on the left).

So there's nothing practical you can do to your own site to speed up spidering, except to increase the amount of traffic to your site...

Colin Pickard
  • Decent answer, except that Google can't determine "quality of content" since it is a machine, not a person. (Sure, it uses various methods like in-links but it still can't directly determine a page's quality.) – DisgruntledGoat May 29 '10 at 11:32
1

It is really simple.

SO is a PageRank 6 site that gives the world new information.

Google has a strong bias toward new information. It will crawl the site many times a day and immediately add the pages to its index. It will favor a page (top 10) for, say, a specific query for a short period of time (a few days), and then it will stop favoring that page and rank it as normal.
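The behaviour claimed above can be pictured with a toy model: a page gets a temporary boost while it is new and falls back to its normal score afterwards. This is purely an illustration of the claim, not Google's actual ranking; the function name, the boost factor, and the three-day window are all invented.

```python
def toy_score(base_score, age_days, boost=2.0, window_days=3.0):
    """Toy ranking score with a temporary freshness boost (made-up numbers)."""
    if age_days < window_days:
        return base_score * boost  # new page: temporarily favored
    return base_score  # older page: ranked "as normal"


# The same page, scored a few hours and then ten days after publication.
fresh = toy_score(1.0, age_days=0.2)
stale = toy_score(1.0, age_days=10)
print(fresh, stale)
```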

This is standard Google procedure and it happens with many, many sites.

As you might guess, grayhat/blackhat SEO exploits that fact in many ways.

johnjohn
0

It is probably also helped by SO providing an RSS feed; I think Google likes feeds from reliable sources.
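To show why a feed makes new content easy to discover, here is a minimal sketch of how any consumer could read one and list the newest entries first. It assumes an Atom-format feed; the sample feed, its URLs, and the `newest_entries` helper are made up for illustration and say nothing about how Google actually processes feeds.

```python
import xml.etree.ElementTree as ET

# Atom namespace (RFC 4287), in ElementTree's {uri}tag notation.
ATOM = "{http://www.w3.org/2005/Atom}"

# Made-up sample feed with two entries.
SAMPLE_FEED = """<feed xmlns="http://www.w3.org/2005/Atom">
  <title>Recent questions (made-up sample)</title>
  <entry>
    <title>First question</title>
    <link href="https://example.com/questions/1"/>
    <updated>2010-05-27T12:20:00Z</updated>
  </entry>
  <entry>
    <title>Second question</title>
    <link href="https://example.com/questions/2"/>
    <updated>2010-05-27T12:29:00Z</updated>
  </entry>
</feed>"""


def newest_entries(feed_xml):
    """Return (updated, title, href) tuples, most recently updated first."""
    root = ET.fromstring(feed_xml)
    entries = []
    for entry in root.findall(f"{ATOM}entry"):
        entries.append((
            entry.findtext(f"{ATOM}updated"),
            entry.findtext(f"{ATOM}title"),
            entry.find(f"{ATOM}link").get("href"),
        ))
    # Timestamps in the same UTC ISO 8601 format sort correctly as strings.
    return sorted(entries, reverse=True)


for updated, title, href in newest_entries(SAMPLE_FEED):
    print(updated, title, href)
```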

Richard Harrison