
I want to prevent all bots from crawling a specific type of page. I know this can be done via robots.txt or .htaccess. However, these pages are generated from the database in response to user requests. I have searched the internet and could not find a good answer.

My link looks like:

http://www.my_website/some_controller/some_action/download?id=<encrypted_id>

Users have a view page where all the displayed data comes from the database, including links of the kind shown above. I want to hide those links from bots, not the entire page. How can I do that?

Abhimanyu Saharan
  • This question is off-topic for Stack Overflow as it doesn't appear to be about programming. – AStopher Feb 20 '15 at 18:39
  • @Abhimanyu Simply create a `robots.txt` file at the root of your site containing `User-agent: *` on one line and `Disallow: /` on the next; this asks every compliant crawler to stay off your site. See [here](http://www.robotstxt.org/robotstxt.html). – AStopher Feb 20 '15 at 18:50
  • @ʎǝʞuoɯɹǝqʎɔ I can't do that. I need the page to be crawled but not that link. – Abhimanyu Saharan Feb 20 '15 at 18:52
  • As long as `.htaccess` and `robots.txt` are autocreated, you cannot do this (unless your application allows custom rules to be inserted). – AStopher Feb 20 '15 at 18:53
  • @ʎǝʞuoɯɹǝqʎɔ Yeah, I can insert custom url rules. – Abhimanyu Saharan Feb 20 '15 at 18:58

2 Answers


Could the page not be generated with a

<meta name="robots" content="noindex">

in the head?
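
If the download URL returns a file rather than an HTML page, there is no head to put the tag in. The `X-Robots-Tag` response header is the documented equivalent for such responses. A minimal sketch, assuming a Python WSGI handler (the question does not say which stack is in use; `download_app` and the response body are hypothetical):

```python
def download_app(environ, start_response):
    """Hypothetical WSGI handler for the download action."""
    headers = [
        ("Content-Type", "application/octet-stream"),
        # Header equivalent of <meta name="robots" content="noindex">
        ("X-Robots-Tag", "noindex"),
    ]
    start_response("200 OK", headers)
    # Placeholder body; a real handler would stream the requested file.
    return [b"file contents"]
```

As with the meta tag, this only affects crawlers that choose to honor it.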

BradleyDotNET
Darryl Edwards

You cannot hide content from bots while keeping it available to other traffic. After all, how do you distinguish a bot from a regular visitor? You can't, without some sort of verification such as a CAPTCHA (those pictures of a word that you type into a box). robots.txt does not stop bots: most bots read it and stay away of their own choice, but only because they are programmed to do so. They do not have to, and can ignore robots.txt completely if they wish.
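
To illustrate that compliance lives in the crawler's own code, here is a minimal sketch of the check a well-behaved bot performs, using Python's standard `urllib.robotparser`. The `Disallow` rule is an assumption built from the path in the question, and the host `www.example.com`, the `view` URL, and the bot name `ExampleBot` are hypothetical:

```python
from urllib.robotparser import RobotFileParser

# Assumed robots.txt: block only the download action, leave the rest crawlable.
ROBOTS_TXT = """\
User-agent: *
Disallow: /some_controller/some_action/download
"""


def allowed(url, user_agent="ExampleBot"):
    """Return True if a compliant crawler would fetch this URL."""
    parser = RobotFileParser()
    parser.parse(ROBOTS_TXT.splitlines())
    return parser.can_fetch(user_agent, url)


base = "http://www.example.com/some_controller/some_action"
print(allowed(base + "/download?id=abc"))  # download link
print(allowed(base + "/view"))             # hypothetical view page
```

A crawler that never calls anything like `allowed()` simply fetches the link, which is why robots.txt cannot be relied on to protect the downloads.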

Daniel Baron