
I created a new website, and I do not want search engines to crawl it or show it in search results.

I have already created a robots.txt:

User-agent: *
Disallow: /

I have an HTML page, and I wanted to use

<meta name="robots" content="noindex">

but Google's documentation says it should only be used on pages that are not blocked by robots.txt, because a crawler that is blocked by robots.txt never fetches the page and so never sees the noindex tag.

Is there any way I can use both noindex and robots.txt?

user2961712

1 Answer


There are two solutions, neither of which is elegant.

You are correct that even with Disallow: /, your URLs might still appear in the search results, most likely without a meta description and with a Google-generated title.
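To see why the noindex tag goes unread, here is a minimal Python sketch that feeds the robots.txt from the question into the standard library's `urllib.robotparser`. The helper name `crawl_allowed` and the `example.com` URLs are placeholders of mine, and this only models a crawler that obeys robots.txt, not Google's actual implementation.

```python
from urllib.robotparser import RobotFileParser

# The robots.txt rules from the question.
ROBOTS_TXT = """\
User-agent: *
Disallow: /
"""


def crawl_allowed(url: str, user_agent: str = "Googlebot") -> bool:
    """Return True if a crawler that obeys robots.txt may fetch `url`."""
    parser = RobotFileParser()
    parser.parse(ROBOTS_TXT.splitlines())
    return parser.can_fetch(user_agent, url)


# example.com is a placeholder domain.
print(crawl_allowed("https://example.com/"))           # False
print(crawl_allowed("https://example.com/page.html"))  # False
```

Since every URL is off limits, a compliant crawler never downloads the HTML, so a `<meta name="robots" content="noindex">` inside it is never seen. The URL itself can still be discovered through links from other sites, which is how it can end up listed anyway.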

Assuming you are only doing this temporarily, the recommended approach is to put basic HTTP auth in front of your site. This isn't great, since users will have to enter a username and password, but it will prevent your site from being crawled and indexed.
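As a rough illustration of what basic auth in front of a site does, here is a small sketch using only Python's standard library. The credentials, realm, port, and the names `is_authorized`, `AuthHandler`, and `serve` are all made up for this example. In practice you would more likely switch on basic auth in your web server or hosting panel, and serve it over HTTPS.

```python
import base64
import hmac
from http.server import BaseHTTPRequestHandler, HTTPServer

# Hypothetical credentials for illustration only; pick your own.
USERNAME = "preview"
PASSWORD = "change-me"


def is_authorized(header):
    """Check an Authorization header value against the expected credentials."""
    if not header or not header.startswith("Basic "):
        return False
    try:
        decoded = base64.b64decode(header[len("Basic "):]).decode("utf-8")
    except ValueError:  # covers bad base64 and undecodable bytes
        return False
    expected = f"{USERNAME}:{PASSWORD}"
    # Constant-time comparison to avoid leaking how much of the value matched.
    return hmac.compare_digest(decoded.encode("utf-8"), expected.encode("utf-8"))


class AuthHandler(BaseHTTPRequestHandler):
    def do_GET(self):
        if not is_authorized(self.headers.get("Authorization")):
            # Crawlers (and anyone without the password) only ever get a 401.
            self.send_response(401)
            self.send_header("WWW-Authenticate", 'Basic realm="Private site"')
            self.end_headers()
            return
        body = b"<html><body>Private page</body></html>"
        self.send_response(200)
        self.send_header("Content-Type", "text/html; charset=utf-8")
        self.send_header("Content-Length", str(len(body)))
        self.end_headers()
        self.wfile.write(body)


def serve(port=8000):
    """Start the demo server on localhost; blocks until interrupted."""
    HTTPServer(("127.0.0.1", port), AuthHandler).serve_forever()
```

Calling `serve()` and opening `http://127.0.0.1:8000/` should bring up the browser's login prompt. A crawler without the credentials receives a 401 for every URL, so there is no page content for it to index.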

If you can't or don't want to put basic auth in front of your site, the alternative is to keep Disallow: / in your robots.txt file and use Google Search Console to regularly purge the Google index by requesting that the site be removed.

This is inelegant in several ways:

  1. You'll have to monitor the search results to see whether any URLs get indexed.
  2. You'll have to manually request the removal in Google Search Console.
  3. Google didn't really intend for the removal feature to be used this way, and it may start ignoring your requests over time. I'd expect it to keep working, though, even if Google would prefer you didn't use it like this.
eywu