
Before you ask 'what have you tried?' or tell me to 'test this yourself', I would like to note that search engines pick up robots.txt changes awfully slowly for any site, so if you could share what you know from theory or experience, that would be appreciated.

For example, is it possible to allow:

http://www.example.com

And block:

http://www.example.com/?foo=foo

I'm not very sure.

Help?

Lucas
    You could try a robot emulator, but using **Disallow** in robots.txt doesn't mean all robots will follow it! – CSᵠ Jan 02 '13 at 23:35

1 Answer


According to Wikipedia, "The robots.txt patterns are matched by simple substring comparisons", and since the query string is part of the URL, you should be able to just add:

Disallow: /?foo=foo

or something more fancy like

Disallow: /*?*

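to block every URL that has a query string. The asterisk is a wildcard that matches any sequence of characters, including none. Wildcards are an extension to the original robots.txt convention, so only crawlers that implement them (Googlebot and Bingbot do) will honour this second form.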
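Since you can't test quickly against a live search engine, here is a rough Python sketch of how a wildcard-aware crawler might evaluate a single Disallow pattern. The function name `rule_matches` is my own invention, and the matching rules (match from the start of the path, `*` matches any run of characters, a trailing `$` anchors the end) are an assumption based on how the major crawlers document their behaviour, not a guarantee for every robot.

```python
import re


def rule_matches(pattern: str, path_and_query: str) -> bool:
    """Return True if a robots.txt path pattern matches a URL path plus query.

    Hypothetical helper for illustration only; real crawlers may differ.
    """
    # A trailing '$' means the pattern must match up to the end of the URL.
    anchored = pattern.endswith("$")
    if anchored:
        pattern = pattern[:-1]

    # Escape every literal piece, then rejoin the pieces with '.*' so that
    # each '*' in the pattern matches any run of characters (including none).
    regex = ".*".join(re.escape(piece) for piece in pattern.split("*"))
    if anchored:
        regex += "$"

    # re.match only matches at the start of the string, which gives the
    # "pattern must match from the beginning of the path" behaviour.
    return re.match(regex, path_and_query) is not None


# The exact rule from this answer blocks the query URL but not the home page.
print(rule_matches("/?foo=foo", "/?foo=foo"))  # blocked
print(rule_matches("/?foo=foo", "/"))          # still allowed

# The wildcard rule blocks any URL containing a '?'.
print(rule_matches("/*?*", "/page?x=1"))       # blocked
print(rule_matches("/*?*", "/page"))           # still allowed
```

Note that under these assumptions `Disallow: /?foo=foo` would also block `/?foo=foobar`, because the rule only has to match the beginning of the URL.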

Example of a robots.txt with dynamic urls.

Sean Dawson