
We have an Umbraco site in a load-balanced environment and we need to make sure only the canonical URL gets crawled, not the individual production node URLs.

We only want example.com to be indexed, while the load-balanced nodes at production1.example.com and production2.example.com should not be.

Do I add a disallow rule for those hosts to the robots.txt, or add a robots meta tag to the head? Or is there another way to keep the load-balancing URLs out of crawlers' indexes?

random
Ingen Speciell

1 Answer

Best solution: don't make node-specific URLs publicly available at all (we usually use a local IP/port to check the site on a specific node).

Since those domains are already exposed, you can serve a different robots.txt depending on the requested host (using URL rewriting).
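As a sketch of that rewriting approach: Umbraco typically runs on IIS, where the URL Rewrite module can rewrite requests for /robots.txt to a disallow-all file whenever the host is a node-specific domain. The rule name, the `robots-disallow.txt` file name, and the hostname pattern below are assumptions to adapt to your environment:

```xml
<!-- web.config fragment; assumes the IIS URL Rewrite module is installed -->
<system.webServer>
  <rewrite>
    <rules>
      <rule name="NodeSpecificRobots" stopProcessing="true">
        <!-- Only match requests for robots.txt -->
        <match url="^robots\.txt$" />
        <conditions>
          <!-- Hypothetical node hostnames; adjust the pattern to your setup -->
          <add input="{HTTP_HOST}" pattern="^production[12]\.example\.com$" />
        </conditions>
        <!-- Serve a disallow-all file instead of the normal robots.txt -->
        <action type="Rewrite" url="robots-disallow.txt" />
      </rule>
    </rules>
  </rewrite>
</system.webServer>
```

where `robots-disallow.txt` would contain a blanket block:

```
User-agent: *
Disallow: /
```

Requests for robots.txt on example.com fall through to the normal file, so only the node domains tell crawlers to stay away.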

marapet