I try to find how to block crawlers to access my links that are something like this:
site.com/something-search.html
I want to block all /something-*
Can someone help me?
In your robots.txt
User-agent: *
Disallow: /something-(1st link)
.
.
.
Disallow: /something-(last link)
Add an entry for each page that you don't want to be seen! Note that regular expressions are not allowed in robots.txt, though some intelligent crawlers can understand wildcards.
have a look here
User-agent: *
Disallow: /something-
This blocks all URLs whose path starts with /something-. For example, with a robots.txt accessible from http://example.com/robots.txt, the following URLs would be blocked:
http://example.com/something-
http://example.com/something-foo
http://example.com/something-foo.html
http://example.com/something-foo/bar
The following URLs would still be allowed:
http://example.com/something
http://example.com/something.html
http://example.com/something/
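You can verify this prefix-matching behavior with Python's standard urllib.robotparser; here is a minimal sketch (the example.com URLs are illustrative):

```python
import urllib.robotparser

# Parse the rules from the answer above directly, without fetching over HTTP.
rp = urllib.robotparser.RobotFileParser()
rp.parse([
    "User-agent: *",
    "Disallow: /something-",
])

# Blocked: the path starts with the /something- prefix.
print(rp.can_fetch("*", "http://example.com/something-foo.html"))  # False

# Allowed: /something without the trailing hyphen is a different prefix.
print(rp.can_fetch("*", "http://example.com/something.html"))      # True
```

Note that robots.txt matching is purely a path-prefix check, which is why the hyphen at the end of the rule matters.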