This is a tricky one to explain. I believe Googlebot is getting confused by the way the IIS sites are set up. The actual issue: when I search Google and the result is www.someSiteURL.com, the description underneath is:
A description for this result is not available because of this site's robots.txt – learn more.
I think the reason the issue exists is fairly clear. Using the example above, there is no page content at www.someSiteURL.com/default.asp. At that level there is just a default.asp file with a bunch of redirects that take the user to the correct physical directory where the sites live. The sites all sit under one root 'Site' in IIS like so:
siteOneDir
siteTwoDir
siteThreeDir
default.asp (this is the page with the redirects)
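In case it helps, the redirect page itself is nothing fancy; a minimal sketch of what it does is below (the host names are placeholders and the real file's matching logic may differ):

<%
' default.asp - simplified sketch of the redirect page (host names are made up)
Dim host
host = LCase(Request.ServerVariables("HTTP_HOST"))

Select Case host
    Case "www.siteone.com"
        Response.Redirect "/siteOneDir/"   ' sends the browser on to the physical dir
    Case "www.sitetwo.com"
        Response.Redirect "/siteTwoDir/"
    Case Else
        Response.Redirect "/siteThreeDir/"
End Select
%>

Response.Redirect is what produces the HTTP/1.1 302 Found that Fetch as Google reports further down.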
How do you overcome this without changing the site setup or resorting to separate IP addresses?
Here is the robots.txt file:
User-agent: *
Allow: /default.asp
Allow: /siteOneDir/
Allow: /siteTwoDir/
Allow: /siteThreeDir/
Disallow: /
BTW, the Google Webmaster Tools robots.txt tester says this is valid. I know some crawlers may not recognize 'Allow', but Google and Bing do, so I don't care about that. I would rather disallow everything and then allow only the specific sites, instead of only using robots.txt to disallow specific sites.
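For comparison, the blacklist version I'd rather avoid would look roughly like this (the directory names here are made up just to illustrate), and every new directory under the root would have to be remembered and added:

User-agent: *
Disallow: /someOldSiteDir/
Disallow: /anotherPrivateDir/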
If I use the Google Webmaster Tools Crawl > Fetch as Google feature and enter www.someSiteURL.com/default.asp, it does come back with a status of 'Redirected', and the response status is HTTP/1.1 302 Found.
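The same 302 can also be seen outside of Webmaster Tools; a quick header check (using the placeholder host from above) such as:

curl -I http://www.someSiteURL.com/default.asp

should show the HTTP/1.1 302 Found status line along with the Location header it redirects to.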