Your .htaccess file cannot magically distinguish "real" users from "bot" users, because from the webserver's perspective there is no distinction.
However, as a general rule, well-behaved bots will respect the contents of robots.txt, while web browsers ignore it entirely.
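For illustration, a robots.txt like the following asks all compliant crawlers to stay out of a directory (the /private/ path here is just a hypothetical example):

```
User-agent: *
Disallow: /private/
```

Note that this is purely advisory: compliant crawlers choose to honor it, but nothing enforces it.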
Alternatively, if you had some way of determining what was a bot and what was not, you could work that rule into your .htaccess configuration. A common tactic is to apply a set of RewriteRules that filter based on the reported User-Agent header. For example, a user-agent that contains the word "googlebot" is probably run by Google.
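A sketch of what such a filter might look like in .htaccess, assuming mod_rewrite is enabled (the list of crawler names is illustrative, not exhaustive):

```apache
RewriteEngine On
# Match common crawler identifiers anywhere in the User-Agent header,
# case-insensitively ([NC])
RewriteCond %{HTTP_USER_AGENT} (googlebot|bingbot|slurp) [NC]
# Refuse matching requests with a 403 Forbidden ([F])
RewriteRule ^ - [F]
```

You could just as easily invert the logic to serve bots different content rather than blocking them.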
User-Agents.org has a list of popular user-agent identifiers. But remember that the contents of this header are set by the person running the bot or browser, and can contain anything she wants it to. For example, malicious users will typically copy the User-Agent string from a popular browser or perhaps a popular search engine, so you can't depend on it.
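To see how trivial spoofing is, here is a minimal Python sketch that sends a request claiming to be a mainstream browser (the URL and the User-Agent string are arbitrary examples):

```python
import urllib.request

# Build a request whose User-Agent claims to be a popular browser.
# Nothing stops a script from sending whatever identifier it likes,
# which is why User-Agent filtering can never be fully trusted.
req = urllib.request.Request(
    "http://example.com/",
    headers={"User-Agent": "Mozilla/5.0 (Windows NT 10.0; Win64; x64)"},
)

# The server will see exactly the string we chose to send.
print(req.get_header("User-agent"))
```

Any .htaccess rule keyed on the User-Agent header would treat this script as a regular browser.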