I am writing a web application with Pyramid and would like to restrict the maximum length of POST requests, so that people can't post huge amounts of data and exhaust all the memory on the server. However, I looked pretty much everywhere I could think of (Pyramid, WebOb, Paster) and couldn't find any option to accomplish this. I've seen that Paster has limits for the number of HTTP headers, the length of each header, etc., but I didn't see anything for the size of the request body.

The server will be accepting POST requests only for JSON-RPC, so I don't need to allow huge request body sizes. Is there a way in the Pyramid stack of accomplishing this?

Just in case this is not obvious from the rest: a solution that has to accept and load the whole request body into memory before checking the length and returning a 4xx error code defeats the purpose of what I'm trying to do, and is not what I'm looking for.

Christian Hudon
  • Right now, for testing, the web server is "paster serve". However, I couldn't find anything for that in its documentation. I also thought there might be a way to do this in the bowels of the Pyramid WSGI handling (if WSGI is used in streaming mode), but I'm not so sure about that one. But I was thinking there should be something for this (and a default value for it) in paster (which is a web server), which is why I was surprised I couldn't find it. – Christian Hudon Apr 19 '12 at 19:01
  • I don't see any configuration options for that kind of thing at all in the paster docs. If they have a forum or mailing list, you might want to ask there. Otherwise, you may need to dive into the source. – agf Apr 19 '12 at 19:09

2 Answers

This isn't really a direct answer to your question. As far as I know, you can write a WSGI app (middleware) that reads the request body and, as long as it stays below a configured limit, passes the request on to the next WSGI layer. If the body goes above the limit, you can stop reading and return an error directly.

But to be honest, I don't really see the point of doing it in Pyramid. For example, if you run Pyramid behind a reverse proxy such as nginx or Apache, you can always limit the size of the request in the front-end server.

Unless you want to run Pyramid with Waitress or Paster directly, without any proxy, you should handle the body size in the front-end server, which should be more efficient than Python.

Edit

I did some research. It isn't a complete answer, but here is something that can be used, I guess. You have to read environ['wsgi.input'], as far as I can tell. This is a file-like object that receives chunks of data from nginx or Apache, for example.

What you really have to do is read that file until the maximum length is reached. If it is reached, raise an error; if it isn't, continue with the request.
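The read-until-the-limit idea can be sketched as a small piece of WSGI middleware that wraps environ['wsgi.input'] and counts bytes as the application reads them. This is only a sketch under some assumptions: the names LimitedInput, BodyTooLarge and limit_body are made up for illustration, the application is assumed to read the body before it calls start_response, and read() without a size relies on the underlying stream returning everything up to the requested size (true for io.BytesIO and other buffered streams). It uses only the standard library.

```python
import io
from wsgiref.util import setup_testing_defaults


class BodyTooLarge(Exception):
    """Raised when more than the allowed number of bytes has been read."""


class LimitedInput:
    """Wrap a WSGI input stream and count the bytes read through it."""

    def __init__(self, stream, max_bytes):
        self._stream = stream
        self._max_bytes = max_bytes
        self._seen = 0

    def _bounded(self, reader, size):
        # Ask for at most one byte past the limit, so an oversized body
        # is detected without buffering all of it.
        allowed = self._max_bytes - self._seen + 1
        if size is None or size < 0 or size > allowed:
            size = allowed
        data = reader(size)
        self._seen += len(data)
        if self._seen > self._max_bytes:
            raise BodyTooLarge()
        return data

    def read(self, size=-1):
        return self._bounded(self._stream.read, size)

    def readline(self, size=-1):
        return self._bounded(self._stream.readline, size)


def limit_body(app, max_bytes):
    """Hypothetical middleware: answer 413 once the body exceeds max_bytes."""
    def middleware(environ, start_response):
        environ["wsgi.input"] = LimitedInput(environ["wsgi.input"], max_bytes)
        try:
            return app(environ, start_response)
        except BodyTooLarge:
            body = b"Request body too large\n"
            start_response(
                "413 Request Entity Too Large",
                [("Content-Type", "text/plain"),
                 ("Content-Length", str(len(body)))],
            )
            return [body]
    return middleware


def echo_length_app(environ, start_response):
    """Toy application: reads the whole body and reports its length."""
    data = environ["wsgi.input"].read()
    body = str(len(data)).encode("ascii")
    start_response(
        "200 OK",
        [("Content-Type", "text/plain"), ("Content-Length", str(len(body)))],
    )
    return [body]


def run(app, body):
    """Call a WSGI app in-process with the given POST body."""
    environ = {
        "REQUEST_METHOD": "POST",
        "CONTENT_LENGTH": str(len(body)),
        "wsgi.input": io.BytesIO(body),
    }
    setup_testing_defaults(environ)  # fills in the remaining required keys
    statuses = []

    def start_response(status, headers, exc_info=None):
        statuses.append(status)

    output = b"".join(app(environ, start_response))
    return statuses[-1], output
```

For instance, run(limit_body(echo_length_app, 10), b"x" * 11) hits the limit and gets the 413 response. A framework that catches exceptions itself, or that reads the body lazily while the response is being iterated, would need the error handled differently.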

You might want to have a look at this answer

Loïc Faure-Lacroix
  • You're right. For deployment I will be handling this in the load-balancing proxy server. But I was a bit surprised not to find such a feature in Paster, at least. Thanks. – Christian Hudon Apr 24 '12 at 17:33

You can do it in a variety of ways. Here are a couple of examples: one using WSGI middleware based on WebOb (which is installed as a dependency when you install Pyramid), and one that uses Pyramid's event mechanism.

"""
Restricting execution based on request body size.
"""
import unittest

from pyramid import httpexceptions
from pyramid.config import Configurator
from pyramid.events import NewRequest
from pyramid.view import view_config
from webob import Request, Response
from webob.exc import HTTPBadRequest


def restrict_body_middleware(app, max_size=0):
    """
    Plain WSGI middleware that depends only on WebOb, so it can be used
    with any WSGI-compliant web framework (which is pretty much all of
    them). It checks the declared Content-Length only and never reads
    the request body.
    """
    def middleware(environ, start_response):
        request = Request(environ)
        # content_length is None when no Content-Length header was sent;
        # such requests are let through here, as if the length were 0.
        content_length = request.content_length or 0
        if content_length <= max_size:
            return app(environ, start_response)
        err_body = (
            "request content_length(%s) exceeds the configured "
            "maximum content_length allowed(%s)" % (content_length, max_size)
        )
        return HTTPBadRequest(err_body)(environ, start_response)

    return middleware


def new_request_restrict(event):
    """
    Pyramid event handler called whenever a new request is received.

    http://docs.pylonsproject.org/projects/pyramid/en/1.2-branch/narr/events.html
    """
    request = event.request
    # Demonstration only: any request that declares a body is rejected.
    # A real limit would compare against a configured maximum, not 0.
    if (request.content_length or 0) > 0:
        # Raise Pyramid's own exception class so that Pyramid's default
        # exception view turns it into a 400 response.
        raise httpexceptions.HTTPBadRequest("too big")


@view_config()
def index(request):
    return Response("HI THERE")


def make_application():
    """
    Make an application with one view.
    """
    config = Configurator()
    config.scan()
    return config.make_wsgi_app()


def make_application_with_event():
    """
    Make an application with one view and one event subscriber
    subscribed to NewRequest.
    """
    config = Configurator()
    config.add_subscriber(new_request_restrict, NewRequest)
    config.scan()
    return config.make_wsgi_app()


def make_application_with_middleware():
    """
    Make an application with one view, wrapped in WSGI middleware.
    The default max_size of 0 rejects every request that has a body.
    """
    return restrict_body_middleware(make_application())


class TestWSGIApplication(unittest.TestCase):
    def testNoRestriction(self):
        app = make_application()
        request = Request.blank("/", body=b"i am a request with a body")
        self.assertTrue(request.content_length > 0, "content_length should be > 0")
        response = request.get_response(app)
        self.assertEqual(response.status_int, 200)

    def testRestrictedByMiddleware(self):
        app = make_application_with_middleware()
        request = Request.blank("/", body=b"i am a request with a body")
        self.assertTrue(request.content_length > 0, "content_length should be > 0")
        response = request.get_response(app)
        self.assertEqual(response.status_int, 400)

    def testRestrictedByEvent(self):
        app = make_application_with_event()
        request = Request.blank("/", body=b"i am a request with a body")
        self.assertTrue(request.content_length > 0, "content_length should be > 0")
        response = request.get_response(app)
        self.assertEqual(response.status_int, 400)


if __name__ == "__main__":
    unittest.main()
Tom Willis
  • Thanks, this is great! I've been meaning to learn enough about WSGI to write middleware, etc., and this is a nice, easy-to-read example. Two quick questions, though. First, is there something in WSGI, WebOb, etc. that ensures that the Content-Length header is always set? (My understanding is that HTTP clients won't necessarily always set this.) Second, has the whole of the request body been read into memory by the time this WSGI middleware is called? – Christian Hudon Apr 24 '12 at 17:32
  • You could write middleware to check for the existence of any header if you wanted to. I would guess that WebOb probably returns None if it wasn't sent. On your second question, I'm not sure whether it will read it into memory or not. I think those low-level details would be for the front-end server like Apache or nginx (which is why I upvoted Loic's answer :) ). WSGI's job is handling the communication between the web server and the Python code; WebOb's job is to represent HTTP requests and responses and help you build them. – Tom Willis Apr 24 '12 at 17:52
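
On the first question in the comments: PEP 3333 allows CONTENT_LENGTH to be empty or absent from the WSGI environ, so a middleware that relies on it has to decide what to do in that case. Below is a minimal sketch of one possible policy, using only the standard library: body-carrying methods without a valid declared length get a 411, and declared lengths over the limit get a 413. The names declared_length and require_length are made up for illustration, and the policy would also turn away chunked uploads, which carry no Content-Length.

```python
def declared_length(environ):
    """Return the declared body length as an int, or None if absent/invalid."""
    raw = environ.get("CONTENT_LENGTH") or ""
    try:
        length = int(raw)
    except ValueError:
        return None
    return length if length >= 0 else None


def _reject(start_response, status):
    """Send a short plain-text error response with the given status line."""
    body = (status + "\n").encode("ascii")
    start_response(
        status,
        [("Content-Type", "text/plain"), ("Content-Length", str(len(body)))],
    )
    return [body]


def require_length(app, max_bytes):
    """Hypothetical middleware: check the declared length before the app runs."""
    def middleware(environ, start_response):
        if environ.get("REQUEST_METHOD") in ("POST", "PUT", "PATCH"):
            length = declared_length(environ)
            if length is None:
                # No usable Content-Length: refuse rather than guess.
                return _reject(start_response, "411 Length Required")
            if length > max_bytes:
                return _reject(start_response, "413 Request Entity Too Large")
        return app(environ, start_response)
    return middleware
```

This only inspects the header and never touches the body, so it says nothing about how much of the body the server has already buffered; that part still depends on the server in front of the application.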