robots.txt
robots.txt is the filename used for implementing the Robots Exclusion Protocol, a standard used by websites to indicate to visiting web crawlers and other web robots which portions of the website they are allowed to visit.
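To illustrate how a well-behaved crawler consults these rules, here is a minimal sketch using urllib.robotparser, the robots.txt parser in Python's standard library. The file contents, the "example.com" URLs, and the "MyCrawler" user-agent name are placeholders chosen for illustration, not part of the protocol itself.

    import urllib.robotparser

    # Hypothetical robots.txt contents: the site asks all crawlers to
    # stay out of /private/ and leaves the rest of the site unrestricted.
    robots_txt = """\
    User-agent: *
    Disallow: /private/
    """

    rp = urllib.robotparser.RobotFileParser()
    rp.parse(robots_txt.splitlines())

    # A compliant crawler checks each URL before fetching it.
    print(rp.can_fetch("MyCrawler", "https://example.com/private/page.html"))  # False
    print(rp.can_fetch("MyCrawler", "https://example.com/index.html"))         # True

In practice a crawler would point the parser at the live file with set_url() and read() rather than parsing an inline string; the inline form simply keeps the example self-contained.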
This relies on voluntary compliance. Not all robots comply with the standard; indeed, email harvesters, spambots, malware and robots that scan for security vulnerabilities may very well start with the portions of the website they have been asked (by the Robots Exclusion Protocol) to stay out of.
The "robots.txt" file can be used in conjunction with sitemaps, another robot inclusion standard for websites.