
I want to allow crawling of files in:

/directory/

but not crawling of files in:

/directory/subdirectory/

Is the correct robots.txt instruction:

User-agent: *
Disallow: /subdirectory/

I'm afraid that if I disallowed /directory/subdirectory/, I would be disallowing crawling of all files in /directory/ as well, which I do not want. So am I correct in using:

User-agent: *
Disallow: /subdirectory/
user523521

2 Answers


You're overthinking it:

User-agent: *
Disallow: /directory/subdirectory/

is correct.
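
A quick way to convince yourself is Python's standard-library robots.txt parser. This is a minimal sketch, assuming a hypothetical host example.com and hypothetical page paths:

import urllib.robotparser

rules = [
    "User-agent: *",
    "Disallow: /directory/subdirectory/",
]

parser = urllib.robotparser.RobotFileParser()
parser.parse(rules)

# Files directly under /directory/ remain crawlable...
print(parser.can_fetch("*", "http://example.com/directory/page.html"))               # True
# ...while anything under /directory/subdirectory/ is blocked.
print(parser.can_fetch("*", "http://example.com/directory/subdirectory/page.html"))  # False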

Matthew Flaschen
  • Isn't User-agent: * Disallow: /directory/subdirectory/ going to remove any files in /directory/? I still want the files in /directory/ in the search index, but not the files in the subdirectory /directory/subdirectory/. – user523521 Mar 22 '11 at 01:51
  • No, why would it do that? It's disallowing the subdirectory, not the parent. – Matthew Flaschen Mar 22 '11 at 01:54
  • Well... as part of my research, many people on the internet are saying that disallowing /directory/subdirectory/ also disallows all files in /directory/, so that it is necessary to do: User-agent: * Disallow: /directory/subdirectory/ Allow: /directory/index.html. I'm just trying to find out which is correct. – user523521 Mar 22 '11 at 02:49
  • @user, what resource says that? Either you misunderstood, or they're wrong. – Matthew Flaschen Mar 22 '11 at 02:55

User-agent: *
Disallow: /directory/subdirectory/

Spiders aren't stupid; they can parse a path :)

alex
  • I don't understand the implication of what you're saying. – user523521 Mar 22 '11 at 01:54
  • @user If you do `cd /directory/subdirectory/` does it take you to `directory`? No, the significant folder is the last in the path, in this case the `subdirectory`. – alex Mar 22 '11 at 02:07
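
To put alex's point in code: the original robots.txt rule treats a Disallow value as a simple prefix of the URL path, so only paths that start with the full /directory/subdirectory/ prefix are blocked. A toy sketch of that prefix rule (the paths are hypothetical, and real crawlers layer extensions such as wildcards on top of this):

def is_blocked(path, disallow="/directory/subdirectory/"):
    # Original robots.txt semantics: a path is blocked if it starts with the rule.
    return path.startswith(disallow)

print(is_blocked("/directory/page.html"))               # False: parent stays crawlable
print(is_blocked("/directory/subdirectory/page.html"))  # True: subdirectory is blocked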