
I've created a website using Django and added a robots.txt route in my main urls.py using:

path('robots.txt', lambda r: HttpResponse("User-agent: *\nDisallow: /", content_type="text/plain")),

It works great, but now I need to add some rules to it. How do I do that?

Ahmed Wagdi

3 Answers


robots.txt is not just an HttpResponse. It is an actual file.

You can either continue to build the whole response manually with the lambda function, in which case you keep building up the string response rule by rule.
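
A minimal sketch of that approach, joining the rule lines into one string before handing it to the lambda (the Disallow paths here are placeholders, not recommendations):

from django.http import HttpResponse
from django.urls import path

# Build the response body rule by rule; add or remove lines as needed.
ROBOTS_TXT = "\n".join([
    "User-agent: *",
    "Disallow: /private/",
    "Disallow: /admin/",
])

urlpatterns = [
    path('robots.txt', lambda r: HttpResponse(ROBOTS_TXT, content_type="text/plain")),
]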

Or you could write a file to the server's disk, add rules to it, and serve that file whenever robots.txt is requested.
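
A sketch of the file-on-disk option, assuming the file lives in the project root; settings.BASE_DIR is used here as the assumed location, so adjust the path to your own layout:

from pathlib import Path

from django.conf import settings
from django.http import HttpResponse
from django.urls import path

def robots_txt(request):
    # Assumed location: a robots.txt file in the project root (BASE_DIR).
    robots_path = Path(settings.BASE_DIR) / "robots.txt"
    return HttpResponse(robots_path.read_text(), content_type="text/plain")

urlpatterns = [
    path('robots.txt', robots_txt),
]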

Further reading on robots.txt (not related to Django)

Related SO question: django serving robots.txt efficiently

Adelin

You can write the robots.txt under your templates directory and then serve it as follows, if you want to serve it through Django:

from django.conf.urls import url
from django.views.generic import TemplateView

urlpatterns = [
    url(r'^robots\.txt$', TemplateView.as_view(template_name="robots.txt", content_type="text/plain"), name="robots_file"),
]
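
Note that django.conf.urls.url was deprecated in Django 3.1 and removed in 4.0; on current versions the equivalent uses re_path:

from django.urls import re_path
from django.views.generic import TemplateView

urlpatterns = [
    re_path(r'^robots\.txt$', TemplateView.as_view(template_name="robots.txt", content_type="text/plain"), name="robots_file"),
]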

However, the recommended way is to serve it directly through your web server's configuration.

Nginx:

location  /robots.txt {
    alias  /path/to/static/robots.txt;
}

Apache:

<Location "/robots.txt">
 SetHandler None
 Require all granted
</Location>
Alias /robots.txt /var/www/html/project/robots.txt
MohitC

In your main app's urls.py:

from django.contrib import admin
from django.urls import path
from django.views.generic.base import TemplateView


urlpatterns = [
    # If you are using admin
    path('admin/', admin.site.urls),
    path(
        "robots.txt",
        TemplateView.as_view(template_name="robots.txt", content_type="text/plain"),
    ),
    path(
        "sitemap.xml",
        TemplateView.as_view(template_name="sitemap.xml", content_type="text/xml"),
    ),
]

Then go to your templates root folder, create a robots.txt file, and add something like this:

User-agent: *
Disallow: /private/
Disallow: /junk/

Go to your templates root folder again and create another file, sitemap.xml. You can add something like this, or generate it properly with a sitemaps generator. Here is an example:

<urlset xmlns="http://www.sitemaps.org/schemas/sitemap/0.9">
  <url>
    <loc>https://examplemysite.com</loc>
    <lastmod>2020-02-01T15:19:02+00:00</lastmod>
    <priority>1.00</priority>
  </url>
</urlset>
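
If you would rather generate the sitemap dynamically, Django ships a sitemap framework (django.contrib.sitemaps, which must be added to INSTALLED_APPS). A minimal sketch, where Article, its updated_at field, and the myapp package are hypothetical stand-ins for your own models:

# sitemaps.py
from django.contrib.sitemaps import Sitemap

from myapp.models import Article  # hypothetical app and model


class ArticleSitemap(Sitemap):
    changefreq = "weekly"
    priority = 0.8

    def items(self):
        # Each object returned here must define get_absolute_url().
        return Article.objects.all()

    def lastmod(self, obj):
        return obj.updated_at  # assumed timestamp field

# urls.py
from django.contrib.sitemaps.views import sitemap
from django.urls import path

from .sitemaps import ArticleSitemap

urlpatterns = [
    path("sitemap.xml", sitemap, {"sitemaps": {"articles": ArticleSitemap}},
         name="django.contrib.sitemaps.views.sitemap"),
]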

Now if you run python manage.py runserver, you can test it at 127.0.0.1:8000/sitemap.xml and 127.0.0.1:8000/robots.txt, and it will work. But this won't work on your production server, because you need to let nginx know about these paths.

So you will need to SSH into your server. With nginx, for example, you should have a configuration file that you named when you set it up. cd into /etc/nginx/sites-available; in that folder you should find the default file (which you should leave alone) and another file that you named, usually after your project or website. Take a backup first, then open that file with nano and add the paths for both files like this:

Be aware of the paths; if you look at the file you should get the idea, since it already contains the paths to your static or media files. You could do something like this:

location  /robots.txt {
    root  /home/myap-admin/projects/mywebsitename/templates;
}
location  /sitemap.xml {
    root  /home/myap-admin/projects/mywebsitename/templates;
}

/home/myap-admin/projects/mywebsitename/templates is just an example path leading to the templates folder; substitute the actual path to your own project's templates.

Make sure you then run service nginx restart (you can check the configuration first with nginx -t).

Elias Glyptis