My site changes its language according to the language of the user's browser. I want to serve the English-language site to all spiders/bots (as Twitter does). What HTTP_ACCEPT_LANGUAGE do spiders/bots send? How do I detect bots/spiders so that I can include the English translation file? I've seen the approach of keeping a list of spiders/bots, but I find it unsatisfactory. Do you have a better solution?

user2584820
  • what happens when HTTP_ACCEPT_LANGUAGE is not set? –  Jul 18 '13 at 00:07
  • If HTTP_ACCEPT_LANGUAGE is not set, or it names a language that has no translation, the English translation is included. Since a bot does not send a language, will the site automatically be served to it in English? – user2584820 Jul 18 '13 at 00:23
  • correct, bot will see 'default' version –  Jul 18 '13 at 00:24

1 Answer

You can do something like this:

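function isSpider()
{
    // Substrings that identify common crawlers in the User-Agent header.
    $spiders = array("googlebot", "WebCrawler", "Slurp", "msn", "VoilaBot", "FurlBot", "NaverBot", "MMCrawler");
    // The header may be missing entirely, so fall back to an empty string.
    $userAgent = isset($_SERVER["HTTP_USER_AGENT"]) ? $_SERVER["HTTP_USER_AGENT"] : "";
    foreach ($spiders as $spider) {
        // Case-insensitive substring match; fixed strings do not need a regex.
        if (stripos($userAgent, $spider) !== false) {
            return true;
        }
    }
    return false;
}

if (isSpider()) {
    // Set the language to English
}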

You can find a list of bot names here: Spider names

This assumes the bot sets the user agent, which is a valid assumption for search engine crawlers.

Sylverdrag
  • No need: the bot does not send the language header, so whatever the site defaults to will be crawled –  Jul 18 '13 at 00:16
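
The point made in the comments (no language header means the default version is served) can be sketched without any bot detection at all. The following is a minimal illustration, not the asker's actual code: the function name pickLanguage, the supported list ("en", "it", "fr") and the "en" default are all assumptions made up for the example. It also takes header entries in the order they appear and ignores q-values, which is a simplification.

```php
<?php
/**
 * Pick a site language from an Accept-Language header value.
 * The name pickLanguage, the $supported list and the "en" default
 * are hypothetical; adapt them to the translations the site has.
 */
function pickLanguage($acceptLanguage, array $supported = array("en", "it", "fr"), $default = "en")
{
    // A missing or empty header (the case described for bots) gets the default.
    if ($acceptLanguage === null || trim($acceptLanguage) === "") {
        return $default;
    }
    // The header looks like "it-IT,it;q=0.9,en;q=0.8".
    // Simplification: entries are tried in listed order, q-values are ignored.
    foreach (explode(",", $acceptLanguage) as $entry) {
        // Drop the ";q=..." weight, then the region suffix such as "-IT".
        $parts = explode(";", $entry);
        $tag = strtolower(trim($parts[0]));
        $subtags = explode("-", $tag);
        $primary = $subtags[0];
        if (in_array($primary, $supported, true)) {
            return $primary;
        }
    }
    // No listed language has a translation, so serve the default.
    return $default;
}

// Usage: read the header if the client sent one, otherwise pass null.
$header = isset($_SERVER["HTTP_ACCEPT_LANGUAGE"]) ? $_SERVER["HTTP_ACCEPT_LANGUAGE"] : null;
$lang = pickLanguage($header);
```

With a fallback like this, a crawler that sends no HTTP_ACCEPT_LANGUAGE ends up on the English version without being identified as a crawler, so the user-agent list is only needed if bots that do send a language header must also be forced to English.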