Today our server was hit by a large number of requests from Facebook IPs in the range 66.220.159.XXX.

The user agent given is "Facebot/1.0".

I can't find any information about this on the Facebook site. It doesn't seem to be one of the regular Facebook user agents, i.e. ‘facebookexternalhit’ or ‘facebookplatform’.

I am trying to find more information on this bot. Will it obey any sort of crawl-delay directive?

araresight

1 Answer

Facebook's documentation includes this section on crawlers:

As of May 28th, 2014 you may also see a crawler with the following user agent string:

Facebot

Facebot is Facebook's web crawling robot that helps improve advertising performance. Facebot is designed to be polite. It attempts to access each web server no more than once every few seconds, in line with industry standards, and will respect your robots.txt settings.

As the docs say, it will respect robots.txt settings. You could try the (non-standard) Crawl-delay directive and see if Facebot respects that.

User-agent: Facebot
Crawl-delay: 10 # seconds
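
Before deploying a rule like this, it can help to check how a robots.txt parser reads it. The following is a minimal sketch using Python's standard `urllib.robotparser` (its `crawl_delay` method needs Python 3.6 or later). It only shows how Python's parser interprets the file; whether Facebot itself honors `Crawl-delay` is not confirmed by the Facebook docs. The helper name `crawl_delay_for` and the two rule strings are made up for illustration.

```python
from urllib.robotparser import RobotFileParser


def crawl_delay_for(robots_txt, user_agent):
    """Return the Crawl-delay Python's parser applies to user_agent, or None."""
    parser = RobotFileParser()
    parser.parse(robots_txt.splitlines())
    return parser.crawl_delay(user_agent)


# Rule keyed on the bare product token (hypothetical robots.txt content).
BARE_TOKEN_RULE = "User-agent: Facebot\nCrawl-delay: 10 # seconds\n"
# Rule keyed on the full string seen in the access log, version included.
VERSIONED_RULE = "User-agent: Facebot/1.0\nCrawl-delay: 10 # seconds\n"

# The crawler identifies itself as "Facebot/1.0" in both checks.
bare_delay = crawl_delay_for(BARE_TOKEN_RULE, "Facebot/1.0")
versioned_delay = crawl_delay_for(VERSIONED_RULE, "Facebot/1.0")

print("bare token rule:", bare_delay)
print("versioned rule:", versioned_delay)
```

Python's parser strips the version from the requesting agent before matching, so it applies the bare `Facebot` rule but does not match a `User-agent` line that includes `/1.0`. Other parsers may behave differently, but keying the rule on the bare token is the safer choice.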
Matt
  • Thanks, no idea how you found that. I tried 'site:facebook.com "facebot"' to no avail, and if you search 'facebot' in the developer portal the results are all junk :( – araresight Jul 23 '14 at 16:13
  • @araresight: TBH, I just googled "Facebot" and found [this forum thread](http://forums.oscommerce.com/topic/397418-what-is-facebot10-and-why-is-it-crawling-my-site/?p=1701200)... one of the answers linked to the relevant Facebook docs. – Matt Jul 23 '14 at 16:16
  • Haha, thanks Matt. I checked that thread but my reading skills are clearly subpar. – araresight Jul 23 '14 at 16:18
  • Also glad to see this thread is already ranking 6th in Google for it after only 17 minutes. – araresight Jul 23 '14 at 16:24
  • @araresight: In which case, it'd be great if you could report back on whether you had any success with `Crawl-delay`, to aid future Googlers! – Matt Jul 23 '14 at 16:25