I am blocking a folder called pdf from being indexed via robots.txt. However, I link directly to a few files in that directory.

Will search engines such as Google index those files, or ignore them because they reside in the pdf folder?

kylex

1 Answer

Short answer: No.

Compliant crawlers will not fetch anything under a URL prefix that you disallow in robots.txt, so those files will not be crawled even when you link to them directly.
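As a rough illustration of the prefix rule, here is a minimal sketch using Python's standard-library `urllib.robotparser`. The host `example.com` and the file names are made up for the example, and the rules are assumed to be a single `Disallow: /pdf/` line for all user agents.

```python
from urllib.robotparser import RobotFileParser

# Hypothetical robots.txt contents: block everything under /pdf/ for all bots.
ROBOTS_TXT = """\
User-agent: *
Disallow: /pdf/
"""

parser = RobotFileParser()
parser.parse(ROBOTS_TXT.splitlines())


def may_fetch(url: str) -> bool:
    """Return True if a generic crawler ("*") may fetch the given URL."""
    return parser.can_fetch("*", url)


if __name__ == "__main__":
    # A directly linked file inside the blocked folder is still blocked.
    print(may_fetch("https://example.com/pdf/report.pdf"))
    # A page outside the folder is unaffected.
    print(may_fetch("https://example.com/index.html"))
```

A direct link does not change the result: the check depends only on the URL path, not on how the crawler discovered the URL.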

Longer answer: It depends.

The Allow keyword is not part of the original robots.txt standard, but some robots honor it. You can use it to Allow a particular URL while Disallowing the entire subtree that contains that URL. Most bots work on a first-match-wins basis, so the order of the lines matters. Google and Bing work on a longest-match-wins basis, regardless of the order of the Allow and Disallow lines.
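The difference between the two strategies can be sketched in a few lines of Python. This is a simplified model, not any crawler's actual implementation: rules are plain path prefixes (no wildcards), the function names and the `/pdf/public.pdf` path are hypothetical, and breaking ties in favor of Allow is an assumption of this sketch.

```python
# Each rule is (allow, prefix): allow=True for "Allow:", False for "Disallow:".
Rule = tuple[bool, str]


def first_match(rules: list[Rule], path: str) -> bool:
    """First-match-wins: the first rule whose prefix matches decides."""
    for allow, prefix in rules:
        if path.startswith(prefix):
            return allow
    return True  # no rule matched, so the path is allowed by default


def longest_match(rules: list[Rule], path: str) -> bool:
    """Longest-match-wins: the matching rule with the longest prefix decides."""
    matches = [(len(prefix), allow) for allow, prefix in rules
               if path.startswith(prefix)]
    if not matches:
        return True
    # max() compares length first; on equal length, Allow (True) beats
    # Disallow (False), which is an assumption made for this sketch.
    return max(matches)[1]


# Disallow the folder first, then Allow one file inside it.
rules = [(False, "/pdf/"), (True, "/pdf/public.pdf")]

if __name__ == "__main__":
    print(first_match(rules, "/pdf/public.pdf"))    # Disallow line is hit first
    print(longest_match(rules, "/pdf/public.pdf"))  # longer Allow prefix wins
```

With this ordering, a first-match bot never reaches the Allow line, so putting the Allow line before the Disallow line is the safer layout if you want both kinds of bot to agree.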

Ladadadada