
I am wondering if there is a way to include in my robots.txt a line which stops Google from indexing any URL in my website, that contains specific text.

I have different sections, all of which contain different pages. I don't want Google to index page 2, page 3, etc. of each section, just the main page.

The URL structure I have is as follows:

http://www.domain.com/section
http://www.domain.com/section/page/2
http://www.domain.com/section/article_name

Is there any way to tell crawlers, in my robots.txt file, NOT to index any URL containing:

/page/
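
Googlebot (though not every crawler) treats `*` in robots.txt rules as a wildcard, so a rule like `Disallow: /*/page/` would cover paginated URLs in every section. A minimal sketch of that matching behavior, assuming Google's documented wildcard semantics (the `googlebot_blocked` helper below is hypothetical, not part of any library):

```python
import re

def googlebot_blocked(path, pattern):
    """Return True if a Googlebot-style Disallow pattern matches the path.

    '*' matches any run of characters, and a trailing '$' anchors the
    end of the URL, per Google's robots.txt extensions.
    """
    regex = re.escape(pattern).replace(r"\*", ".*")
    if regex.endswith(r"\$"):
        regex = regex[:-2] + "$"
    return re.match(regex, path) is not None

print(googlebot_blocked("/section/page/2", "/*/page/"))        # blocked
print(googlebot_blocked("/section", "/*/page/"))               # not blocked
print(googlebot_blocked("/section/article_name", "/*/page/"))  # not blocked
```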

Thanks in advance everyone!

Cristian

3 Answers

`Disallow` rules take a URL path, not a full URL:

User-agent: Googlebot
Disallow: /section/

or depending on your requirement:

User-agent: Googlebot
Disallow: /section/page/

Also, you may use Google Webmaster Tools rather than the robots.txt file.

Dave Hogan
  • The only problem I see is that articles take the URL www.domain.com/section/article_name, so disallowing everything after /section/ would disallow the articles too. Can I just disallow a URL if it has "/page" in it? Thanks for your help; your answer is almost perfect! – Cristian Jun 29 '12 at 13:32
  • @Cristian - Use Google's Webmaster Tools if you want more control. – Security Hound Jun 29 '12 at 13:37
  • @Ramhound I do use Google Webmaster Tools, but I can't find where I can do what I'm asking. Could you provide a little guidance? – Cristian Jun 29 '12 at 13:39
  • With this, Google only removes the page title from search results; the page is still indexed and appears in search. – Mohammad Dec 17 '17 at 05:08
  1. Go to GWT / Crawl / URL Parameters
  2. Add the parameter: page
  3. Set it to: No URLs
modernmagic

You can directly use `Disallow: /*/page/`. (A plain `Disallow: /page` only matches paths that start with `/page`, so it would not block `/section/page/2`.)
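Under standard robots.txt prefix matching (no wildcards), a rule only blocks paths that start with the given string, which is easy to check with Python's stdlib parser:

```python
from urllib.robotparser import RobotFileParser

# Parse a robots.txt body containing a plain prefix rule (no wildcards).
rp = RobotFileParser()
rp.parse([
    "User-agent: *",
    "Disallow: /page",
])

# /page/2 starts with /page, so it is blocked...
print(rp.can_fetch("*", "http://www.domain.com/page/2"))          # False
# ...but /section/page/2 does not, so it remains crawlable.
print(rp.can_fetch("*", "http://www.domain.com/section/page/2"))  # True
```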

Aashish Katta