I am building a high-performance API. I have used Tastypie for a long time, but sometimes I need something simpler, so for this API I have decided to use Django Simple Rest (https://github.com/croach/django-simple-rest). It provides the basics, and I can use forms and the ORM to validate and save data without the overhead of a generic API library.

I want to verify the incoming data, and I am using model forms to do so. That is nice and simple and validates the data against the model, but I need a little more.

I want to make sure no script or HTML gets posted, although I might allow HTML for some fields. I know I can use html5lib for all sorts of validation, and I probably will, but the only examples I have seen specify every field individually. I am trying to work out a way to reject JavaScript or HTML in every field by default, with the ability to override that where appropriate. I don't want to describe every model field in the forms; I want something generic.

Here is my Simple Rest `put` method.

    def put(self, request, *args, **kwargs):
        data = json.loads(request.body)
        try:
            todo = Item.objects.get(id=kwargs.get('id'))
        except Item.DoesNotExist:
            return HttpNotFound()

        form = TodoForm(instance=todo, data=data)
        if not form.is_valid():
            return JsonFormErrors(form)
        form.save()

        return JsonStatus(True, message="saved successfully")

Here is my form.

    from django import forms
    from .models import Item


    class TodoForm(forms.ModelForm):
        class Meta:
            model = Item
            fields = ('id', 'text')

What is the best way to provide generic protection for all my put methods and forms, with the ability to override the behaviour when I want to accept HTML?

I appreciate your help!

Rich

  • You most likely want to clean your arbitrary HTML input with `lxml.html.clean()` http://lxml.de/lxmlhtml.html – Mikko Ohtamaa Jul 02 '15 at 22:31
  • OK, but how do I make that happen gracefully across all input? I think I am trying to extend the functionality of model forms. – Rich Jul 02 '15 at 22:42

1 Answer

You might be able to create a new class inheriting from ModelForm that cleans each of the values immediately.

I was thinking of something like this:

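    from django import forms
    from lxml.html.clean import clean_html


    class SanitizedModelForm(forms.ModelForm):
        """ModelForm that runs every submitted string through clean_html."""

        def __init__(self, data=None, *args, **kwargs):
            if data is not None:
                sanitized_data = {}
                # data is a mapping, so iterate over its items, not its keys.
                for key, value in data.items():
                    # Only clean non-empty strings (use basestring on Python 2);
                    # lxml cannot parse an empty document.
                    if isinstance(value, str) and value.strip():
                        value = clean_html(value)
                    sanitized_data[key] = value
                data = sanitized_data
            super(SanitizedModelForm, self).__init__(data, *args, **kwargs)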

I'm not sure if clean_html is the right method for your scenario.
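
If you want "no HTML by default, opt in per form", one possible shape is a small mixin with an allow-list. The sketch below is only an illustration under my own assumptions: `sanitize_payload`, `SanitizingMixin` and `allow_html_fields` are made-up names, and it swaps `clean_html` for the standard library's `html.escape`, which neutralises markup by escaping it instead of stripping it. As far as I know, `clean_html` removes scripts but keeps ordinary tags, so it may not be strict enough for the default case anyway.

```python
import html


def sanitize_payload(data, allow_html=()):
    """Return a copy of data with HTML escaped in every string value.

    Keys listed in allow_html are passed through untouched.
    (Hypothetical helper, not part of Django or Simple Rest.)
    """
    cleaned = {}
    for key, value in data.items():
        if isinstance(value, str) and key not in allow_html:
            cleaned[key] = html.escape(value)
        else:
            # Non-strings and explicitly allowed fields are left as they are.
            cleaned[key] = value
    return cleaned


class SanitizingMixin:
    """Hypothetical mixin: place it before forms.ModelForm in the bases."""

    # Override in a subclass to name the fields that may contain HTML.
    allow_html_fields = ()

    def __init__(self, data=None, *args, **kwargs):
        # Leave data as None so an unbound form stays unbound.
        if data is not None:
            data = sanitize_payload(data, self.allow_html_fields)
        super().__init__(data, *args, **kwargs)
```

A form would then be declared as `class TodoForm(SanitizingMixin, forms.ModelForm)` and would set `allow_html_fields = ('text',)` only where HTML is wanted. Bear in mind that escaping on input can lead to double escaping if your templates also auto-escape on output, so rejecting the input with a validation error may suit you better.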

schillingt
  • Thanks @Schillingt. That is the right solution, I went with html5lib in the end but the subclassing was the trick that I needed. – Rich Jul 07 '15 at 19:54