
I'm currently running a web service where people can browse products. The URL for a product is basically just /products/product_pk/. However, we don't serve products with certain product_pks, e.g. nothing smaller than 200. Is there a way to discourage bots from hitting URLs like /products/10/, since they will only receive a 404?

Thank you for your help :)

pasql

1 Answer


I am pretty sure that crawlers don't guess auto-generated URLs and fail on them. A crawler visits your website and follows the links it finds to decide what to crawl next. If any of your pages contain links that return a 404, that is bad design on your site, since those links should not be there.

Kent Kostelac
  • Hey, there are no embedded links that would lead to a 404 page, because that would indeed be bad design... – pasql Nov 10 '15 at 01:33
  • Then don't worry. No crawler will ever try a URL */products/product_pk/* where *product_pk* is below 200, because the crawler will never find such a URL while crawling. – Kent Kostelac Nov 10 '15 at 12:19
  • Unfortunately I do have exactly this case, so I added all URLs with unsupported product_pks to the disallowed section of robots.txt. Right now I don't get any hits or 404 error warning mails anymore. However, this seems like a pretty bad hack :/ – pasql Nov 10 '15 at 21:58
  • Very bad hack. You should remove any URL that returns a 404, or redirect all bad URLs to another page (see the sketch below), or keep your current solution. – Kent Kostelac Nov 10 '15 at 23:44
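
A minimal sketch of the "redirect all bad URLs" option from the comments above, assuming a Django view (the framework isn't stated in the thread; the *product_pk* naming only suggests it). The names `MIN_PRODUCT_PK`, `product_detail`, `Product`, and the `product-list` URL name are hypothetical:

```python
from django.shortcuts import get_object_or_404, redirect, render

from .models import Product  # hypothetical app model

MIN_PRODUCT_PK = 200  # smallest product_pk that is actually served


def product_detail(request, product_pk):
    # Instead of letting /products/10/ answer with a 404, send the client
    # (and any crawler that somehow reaches it) to the product listing.
    if int(product_pk) < MIN_PRODUCT_PK:
        return redirect("product-list", permanent=True)
    product = get_object_or_404(Product, pk=product_pk)
    return render(request, "products/detail.html", {"product": product})
```

A permanent redirect (or a 410 Gone response) lets well-behaved crawlers drop the URL on their own, and it avoids maintaining a long Disallow list in robots.txt.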