I browsed the web trying to find the ideal robots.txt content for a hosted WordPress blog. I found several options, for example here and here.

I thought this would be a good question for ServerFault: for a "simple" blog running on WordPress, what would be the ideal robots.txt?

Currently I have the following robots.txt that I found elsewhere on the web:

User-agent: *
Disallow: /cgi-bin
Disallow: /wp-admin
Disallow: /wp-includes
Disallow: /wp-content/plugins
Disallow: /wp-content/cache
Disallow: /wp-content/themes
Disallow: /trackback
Disallow: /feed
Disallow: /comments
Disallow: /category/*/*
Disallow: */trackback
Disallow: */feed
Disallow: */comments
Disallow: /*?*
Disallow: /*?
Allow: /wp-content/uploads


# Google Image
User-agent: Googlebot-Image
Disallow:
Allow: /*


# Google AdSense
User-agent: Mediapartners-Google*
Disallow:
Allow: /*


# Internet Archiver Wayback Machine
User-agent: ia_archiver
Disallow: /


# digg mirror
User-agent: duggmirror
Disallow: /

Thanks

Roee Adler
    There is a WordPress StackExchange site winding its way through the process at Area51. I invite any WordPress users/admins here to check it out and "commit" if you think it would be helpful. I did! http://area51.stackexchange.com/proposals/1500/wordpress-answers – tomjedrz Jul 24 '10 at 16:49

2 Answers

There is no "ideal" robots.txt, although there will be one that's ideal for you. Just work out what you do want the bots to see and create a robots.txt which disallows everything else. There is generally no need for "Allow" lines, since robots read these files to find out what you don't want them to look at and assume everything else is fair game. For example, the part of my own robots.txt that applies to WordPress is:

Disallow: /blog/wp-*.php
Disallow: /blog/wp-admin/
Disallow: /blog/wp-includes/
Disallow: /blog/wp-content/
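
If you want to check how a set of rules like this plays out, here is a minimal sketch using Python's standard `urllib.robotparser`. The `User-agent: *` line, the sample paths, the `ExampleBot` name and the `allowed` helper are all assumptions made for illustration; they are not part of my real file. The `wp-*.php` wildcard rule is left out because this parser only does plain prefix matching, so it would not treat the `*` as a wildcard.

```python
from urllib.robotparser import RobotFileParser

# Hypothetical rules modeled on the fragment above. The User-agent line is an
# assumption, and the wildcard rule is omitted because urllib.robotparser
# matches paths by plain prefix only.
RULES = """\
User-agent: *
Disallow: /blog/wp-admin/
Disallow: /blog/wp-includes/
Disallow: /blog/wp-content/
"""

parser = RobotFileParser()
parser.parse(RULES.splitlines())


def allowed(path, agent="ExampleBot"):
    """Return True if a well-behaved bot may fetch the given path."""
    return parser.can_fetch(agent, path)


# Sample paths (made up for the example).
for path in (
    "/blog/wp-admin/options.php",
    "/blog/wp-content/uploads/pic.png",
    "/blog/2009/09/hello-world/",
    "/blog/feed/",
):
    print(path, "->", "allowed" if allowed(path) else "blocked")
```

Anything that does not start with one of the disallowed prefixes comes back as allowed, with no "Allow" line in sight. Note that disallowing all of /blog/wp-content/ also blocks the uploads directory, which may or may not be what you want.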
John Gardeniers
  • This is a much better option than what you have. Adding your feed to Google will see articles getting indexed much quicker. The quickest I've seen so far is under an hour. – Ryaner Oct 13 '09 at 11:11

I've never considered using a robots.txt file with WordPress before - I just make sure that the permissions on files I don't want random users running (like the installer or upgrader) are correct.

warren
  • Permissions serve a different (but even more important) function. Robots.txt is only to tell bots what you do and don't want them to index. It has no effect on what users, or badly behaved bots, can see. – John Gardeniers Sep 09 '09 at 13:03
  • guess I figure that if you know it's wordpress, you'll also know what the subdirs are :) – warren Sep 09 '09 at 14:36