
I've researched many ways to prevent Google and other search engines from crawling a specific directory. The two most popular ones I've seen are:

  1. Adding it to the robots.txt file: `Disallow: /directory/`
  2. Adding a meta tag: `<meta name="robots" content="noindex, nofollow">`

Which method would work best? I want this directory to remain "invisible" to search engines: neutral and "just there," so that it does not affect any of my site's ranking.

user2154729

1 Answer


Robots.txt is the way to go for this.

According to Google, you only use the meta tag if you don't have rights to create/edit the robots.txt file.
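As a rough way to check what a `Disallow` rule like the one in the question covers, here is a minimal sketch using Python's standard `urllib.robotparser`. The `example.com` URLs, the `ROBOTS_TXT` contents, and the helper name `is_crawl_allowed` are assumptions made for illustration. It only shows how a compliant crawler would interpret the rule; it says nothing about ranking.

```python
from urllib.robotparser import RobotFileParser

# Hypothetical robots.txt contents; "/directory/" is the placeholder
# path from the question, not a real site layout.
ROBOTS_TXT = """\
User-agent: *
Disallow: /directory/
"""


def is_crawl_allowed(robots_txt: str, user_agent: str, url: str) -> bool:
    """Return True if a crawler that honors robots.txt may fetch the URL."""
    parser = RobotFileParser()
    parser.parse(robots_txt.splitlines())
    return parser.can_fetch(user_agent, url)


# example.com is a stand-in domain (assumption).
blocked = is_crawl_allowed(
    ROBOTS_TXT, "Googlebot", "https://example.com/directory/page.html"
)
allowed = is_crawl_allowed(
    ROBOTS_TXT, "Googlebot", "https://example.com/index.html"
)
print("directory page crawlable:", blocked)
print("home page crawlable:", allowed)
```

The rule is a path-prefix match, so everything under `/directory/` is excluded for crawlers that respect the file, while the rest of the site stays crawlable.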

canhazbits
  • Thanks! Also, will adding this to the robots.txt file make the directory "invisible" to the search engine? That is, will it avoid affecting the ranking of the other pages on my site? – user2154729 Jun 30 '13 at 18:10
  • @user2154729 Nothing you put on the web is "invisible". But Google is expected to respect the `robots.txt` file and not download forbidden files, which then should not have any impact whatsoever on your other pages. – janos Jun 30 '13 at 18:31
  • @janos Thanks! Lastly, do you think it's safe to just place a 404 or 403 error for the entire directory in my .htaccess file, according to the IP? This directory is mainly just for me and a few other people. – user2154729 Jun 30 '13 at 18:50
  • @user2154729 It's *safer* ;-) (than not having it) – janos Jun 30 '13 at 18:55
  • @janos Only a few files in the directory will be available to the public, the rest of the files (including the main directory URL) will have either a 404 or a 403 error. Which one would be better for this case (404 or 403)? Keep in mind that I want to maintain the ranking of the rest of the pages on my site, so I don't want this 404 or 403 error affecting it. – user2154729 Jun 30 '13 at 19:03
  • 403 is more appropriate, as it means *Forbidden*. Google will not attempt to download files that are matched by your `robots.txt`, so it will not hit any 403 errors. – janos Jun 30 '13 at 19:11
  • @janos So all in all, if I add a 403 error to the directory and only to a FEW files in the directory, will it affect the overall ranking of the other content on my website? – user2154729 Jun 30 '13 at 19:16
  • @user2154729 I don't think Googlebot will come from just one IP; it probably has many IPs. So don't try to hardcode what you see from them as their IP, because I'm guessing it could/will change. – canhazbits Jun 30 '13 at 19:38
  • @canhazbits I know, I'm doing it the other way around. Just allowing access to the directory and specific files from my IP. – user2154729 Jun 30 '13 at 19:42
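
The allowlist approach described in these comments (serve the directory only to known IPs, keep a few files public, answer everything else with 403) can be sketched as plain decision logic. This is a Python illustration of the idea, not `.htaccess` configuration; the addresses, the paths, and the function name `status_for` are all hypothetical.

```python
import ipaddress

# Hypothetical values for illustration only (documentation-range IPs).
ALLOWED_NETWORKS = [ipaddress.ip_network("203.0.113.7/32")]
PROTECTED_PREFIX = "/directory/"
PUBLIC_FILES = {"/directory/public.html"}


def status_for(client_ip: str, path: str) -> int:
    """Return the HTTP status a request would get under the allowlist."""
    if not path.startswith(PROTECTED_PREFIX):
        return 200  # outside the protected directory: unaffected
    if path in PUBLIC_FILES:
        return 200  # the few files left open to the public
    addr = ipaddress.ip_address(client_ip)
    if any(addr in net for net in ALLOWED_NETWORKS):
        return 200  # an allowlisted visitor
    return 403  # Forbidden, as suggested above


print(status_for("203.0.113.7", "/directory/"))
print(status_for("198.51.100.1", "/directory/"))
print(status_for("198.51.100.1", "/directory/public.html"))
```

Allowlisting your own addresses, rather than blocklisting crawler addresses, sidesteps the concern about Googlebot's IPs changing.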