In my multisite application, I need to serve a robots.txt file for each of the sites. My implementation is as follows:
1- Included a RobotsContent property of type textarea on the Start page type (a sketch of the property definition is included after the handler code below).
2- Added an HTTP handler, as given below, along with a web.config entry to register it.
public class RobotsTxtHandler : IHttpHandler // class wrapper shown for completeness; the name is illustrative
{
    // Assumed to be resolved via constructor injection or the service locator in the real implementation
    private readonly ISiteDefinitionRepository _siteDefinitionRepository;
    private readonly IContentLoader _contentLoader;

    public bool IsReusable => false;

    public void ProcessRequest(HttpContext context)
    {
        var uri = context.Request.Url;

        // Find the site whose configured hosts match the requested host name
        var currentSite = _siteDefinitionRepository.List()
            .FirstOrDefault(siteDefinition => siteDefinition.Hosts
                .Any(hostDefinition => hostDefinition.Authority.Hostname.Equals(uri.Host)));

        if (currentSite != null)
        {
            var startPage = _contentLoader.Get<StartPage>(currentSite.StartPage);
            var robotsContentProperty = startPage.RobotsContent;

            // Serve the editor-entered content as the robots.txt response
            if (!string.IsNullOrEmpty(robotsContentProperty))
            {
                context.Response.ContentType = "text/plain";
                context.Response.StatusCode = 200;
                context.Response.Write(robotsContentProperty);
                context.Response.End();
            }
        }
    }
}
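For reference, the property from step 1 is defined roughly like this on the start page type (a minimal sketch; the display attributes are illustrative, only the RobotsContent property and the textarea UI hint matter here):

using System.ComponentModel.DataAnnotations;
using EPiServer.Core;
using EPiServer.Web;

public class StartPage : PageData
{
    // Free-text area that editors fill with the robots.txt contents for this site
    [Display(
        Name = "Robots content",
        Description = "Content served as robots.txt for this site")]
    [UIHint(UIHint.Textarea)]
    public virtual string RobotsContent { get; set; }
}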
I am aware there are a few NuGet packages available for handling robots.txt, but for various reasons, and because I need more control over this one, I created a custom implementation. The above works as expected.
Referring to https://developers.google.com/search/docs/advanced/robots/create-robots-txt: it mentions that the rules are case sensitive, that they come in groups (user-agent, allow, disallow), and that certain directives (user-agent, allow, disallow) are required. With all these rules in place, and this being a free textarea, I can add any random content to it. So are there any validations that I can apply to this? There are online validators available, but is there any way I can validate the text when it is being published?
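What I am hoping for is something that runs at publish time, for example a sketch along these lines using the CMS's IValidate<T> hook (the RobotsContentValidator name and the very loose directive check are just illustrative, not a full robots.txt grammar):

using System;
using System.Collections.Generic;
using System.Linq;
using EPiServer.Validation;

// Illustrative validator; implementations of IValidate<T> are picked up by the CMS
// and run when the content is saved or published
public class RobotsContentValidator : IValidate<StartPage>
{
    // Directive names this sketch accepts; extend as needed
    private static readonly string[] KnownDirectives = { "user-agent", "allow", "disallow", "sitemap" };

    public IEnumerable<ValidationError> Validate(StartPage instance)
    {
        var content = instance.RobotsContent;
        if (string.IsNullOrWhiteSpace(content))
        {
            yield break;
        }

        var lines = content.Split('\n')
            .Select(line => line.Trim())
            .Where(line => line.Length > 0 && !line.StartsWith("#"));

        foreach (var line in lines)
        {
            // Every non-comment line should look like "directive: value"
            var separatorIndex = line.IndexOf(':');
            var directive = separatorIndex > 0 ? line.Substring(0, separatorIndex).Trim() : line;

            if (!KnownDirectives.Contains(directive, StringComparer.OrdinalIgnoreCase))
            {
                yield return new ValidationError
                {
                    Severity = ValidationErrorSeverity.Error,
                    PropertyName = nameof(StartPage.RobotsContent),
                    ErrorMessage = $"'{directive}' does not look like a valid robots.txt directive."
                };
            }
        }
    }
}

As far as I understand, validation errors with a severity of Error would then stop the publish and surface the message to the editor, which is the behaviour I am after. Is a publish-time check like this the right approach, or is there something built in for it?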