
We use Nginx + Lua and want to support chunked uploads according to this workaround, which generally works. My question is how I can process the upload request as usual (work with headers, body, eof):

    local cjson = require "cjson"
    local upload = require "resty.upload"

    local chunk_size = 4096

    local form, err = upload:new(chunk_size)
    if not form then
        ngx.log(ngx.ERR, "failed to new upload: ", err)
        ngx.exit(500)
    end

    form:set_timeout(1000) -- 1 sec

    while true do
        local typ, res, err = form:read()
        if not typ then
            ngx.say("failed to read: ", err)
            return
        end

        ngx.say("read: ", cjson.encode({typ, res}))

        if typ == "eof" then
            break
        end
    end

and use that chunk script only when the request has the chunked upload header (`-H "Transfer-Encoding: chunked"`).

Sorry if it is something obvious, but after a couple of days of googling I don't see any example. My suggestion is:

-- read headers
local headers = ngx.req.get_headers()

-- read body (ngx.req.read_body() must be called first)
ngx.req.read_body()
local body = ngx.req.get_body_data()

and then I don't need form:read() and iterating over the form until eof. I'd appreciate any links or examples.

curl example:

curl -X PUT localhost:8080/test -F file=@./myfile -H "Transfer-Encoding: chunked"
Sergii Getman
  • To make things clear – is chunked transfer encoding used _with_ form-data (I mean, first an uploaded file is encoded using multipart/form-data and then the request body, that is, the form, is encoded using chunked transfer encoding) or is chunked transfer encoding used with a raw file body (i.e., without form-data)? – un.def May 29 '20 at 17:58
  • it's just one file request with header `-H "Transfer-Encoding: chunked"` right now – Sergii Getman Jun 01 '20 at 07:41
    So, depending on request headers you want to decode a chunked body and treat it as an uploaded file (if `transfer-encoding: chunked` header is present) or extract an uploaded file from some part of `multipart/form-data` (if `content-type: multipart/form-data` is present), do I understand correctly? – un.def Jun 01 '20 at 08:15
  • we load just one file in _multipart_ way and want to send it by chunk(`-H "Transfer-Encoding: chunked"`) – Sergii Getman Jun 01 '20 at 10:23
  • so literally I need to do both – Sergii Getman Jun 01 '20 at 10:23
  • Hmm, I need to clarify again. Your HTTP client sends one file as a part of a multipart/form-data form, and the request body (that is, the form itself) is encoded using chunked encoding. Thus, the client should send the following two headers: `content-type: multipart/form-data` and `transfer-encoding: chunked`. Is it correct? – un.def Jun 01 '20 at 12:18
  • yes, correct! in terms of _curl_ something like `curl -X PUT localhost:8080/test -F file=@./myfile -H "Transfer-Encoding: chunked"` – Sergii Getman Jun 01 '20 at 12:25

2 Answers


Unfortunately, ngx.req.socket (https://github.com/openresty/lua-nginx-module#ngxreqsocket), which is used by lua-resty-upload under the hood, does not handle body encodings at the moment. That is, when you read from the socket object, you receive the request body as is, so you need to decode it yourself. lua-resty-upload doesn't do that; it expects a plain formdata body without any additional encoding. See https://github.com/openresty/lua-resty-upload/issues/32#issuecomment-266301684 for further explanation.

As mentioned at the link above, you can use ngx.req.read_body/ngx.req.get_body_data, which are backed by "nginx's built-in request body reader with chunked encoding support". The ngx.req.get_body_data method returns an already decoded body. You can feed the body to some formdata parser that accepts the body as a byte string rather than reading it from the cosocket (as lua-resty-upload does). For example, you can use lua-resty-multipart-parser: https://github.com/agentzh/lua-resty-multipart-parser

There is a significant downside – the request body needs to be read into a Lua string at once, that is, the whole request body is stored in memory as a single Lua string object.

Theoretically, this could be fixed. We could modify lua-resty-upload to accept a socket-like object instead of the hardcoded one (https://github.com/openresty/lua-resty-upload/blob/v0.10/lib/resty/upload.lua#L60) and write some sort of buffer that lazily reads bytes from an iterator and provides the socket-like interface. Maybe I'll try it later.
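To make the idea concrete, such a lazily-reading buffer could be sketched like this (a rough sketch only; `socket_from_iterator` and the iterator contract are illustrative assumptions, and a real implementation would need the other cosocket methods lua-resty-upload calls, e.g. `settimeout`):

```lua
-- Sketch: wrap a chunk iterator in an object exposing the cosocket-style
-- receive(size) method that lua-resty-upload reads from.
local function socket_from_iterator(iter)
    local buf = ''      -- bytes received but not yet consumed
    local eof = false
    local sock = {}

    function sock:receive(size)
        -- pull chunks from the iterator until `size` bytes are buffered
        while not eof and #buf < size do
            local chunk = iter()
            if chunk then
                buf = buf .. chunk
            else
                eof = true
            end
        end
        if #buf == 0 and eof then
            return nil, 'closed'
        end
        local data = buf:sub(1, size)
        buf = buf:sub(size + 1)
        return data
    end

    function sock:settimeout(ms)
        -- no-op in this sketch; a real wrapper would forward the timeout
    end

    return sock
end
```

The iterator here is anything that returns the next decoded body chunk per call (and nil at the end), so chunked decoding stays outside the buffer itself.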


Here is an example that uses both libraries. It does exactly what you asked for (but remember, it reads the whole body into a string if the request body is chunked-encoded).

# nginx.conf
http {
    server {
        listen 8888;
        location = /upload {
            content_by_lua_block {
                require('upload').handler()
            }
        }
    }
}
-- upload.lua
local upload = require('resty.upload')
local multipart_parser = require('resty.multipart.parser')

local get_header = function(headers, name)
    local header = headers[name]
    if not header then
        return nil
    end
    if type(header) == 'table' then
        return header[1]
    end
    return header
end

local handler = function()
    -- return 405 if HTTP verb is not POST
    if ngx.req.get_method() ~= 'POST' then
        return ngx.exit(ngx.HTTP_NOT_ALLOWED)
    end
    local headers = ngx.req.get_headers()
    local content_type = get_header(headers, 'content-type')
    -- return 400 if the body is not a formdata
    if not content_type or not string.find(content_type, '^multipart/form%-data') then
        return ngx.exit(ngx.HTTP_BAD_REQUEST)
    end
    local transfer_encoding = get_header(headers, 'transfer-encoding')
    if transfer_encoding == 'chunked' then
        -- parse form using `lua-resty-multipart-parser`
        ngx.say('*** chunked')
        -- read the body, chunked encoding will be decoded by nginx
        ngx.req.read_body()
        local body = ngx.req.get_body_data()
        if not body then
            local filename = ngx.req.get_body_file()
            if not filename then
                return ngx.exit(ngx.HTTP_INTERNAL_SERVER_ERROR)
            end
            -- WARNING
            -- don't use this code in production, file I/O is blocking,
            -- you are going to block nginx event loop at this point!
            local fd = io.open(filename, 'rb')
            if not fd then
                return ngx.exit(ngx.HTTP_INTERNAL_SERVER_ERROR)
            end
            body = fd:read('*a')
            fd:close()
        end
        local parser = multipart_parser.new(body, content_type)
        while true do
            local part = parser:parse_part()
            if not part then
                break
            end
            ngx.say('>>> ', part)
        end
    else
        -- parse form using `lua-resty-upload` (in a streaming fashion)
        ngx.say('*** not chunked')
        local chunk_size = 8 -- for demo purposes only, use 4096 or 8192
        local form = upload:new(chunk_size)
        while true do
            local typ, res = form:read()
            if typ == 'eof' then
                break
            elseif typ == 'body' then
                ngx.say('>>> ', res)
            end
        end
    end
end

return {
    handler = handler
}
$ curl -X POST localhost:8888/upload -F file='binary file content'                             

*** not chunked
>>> binary f
>>> ile cont
>>> ent

As you can see, the body is read and processed chunk by chunk.

$ curl -X POST localhost:8888/upload -F file='binary file content' -H transfer-encoding:chunked

*** chunked
>>> binary file content

Here, conversely, the body is processed at once.

un.def
  • it works fine with text files but when I send binary data I get error cause cannot get body: `local body = ngx.req.get_body_data()`, error: `parser.lua:58: bad argument #1 to 'find' (string expected, got nil)` – Sergii Getman Jun 02 '20 at 17:14
  • Yeah, you are right. `ngx.req.get_body_data` returns only the in-memory buffer content, but if the body is too big for the buffer, it will be stored in a temporary file. In this case `ngx.req.get_body_file` is used to obtain the filename. I've updated the example; now the Lua I/O API is used to read the file. But be aware, this approach is even worse – file I/O is **blocking** in most cases. That is, you'll block the nginx event loop. This can be avoided by offloading I/O operations to a thread pool, e.g., https://github.com/tokers/lua-io-nginx-module But it looks too complicated for such a simple task, I think – un.def Jun 02 '20 at 18:09
  • hi @un.def I have successfully applied chunked upload patch with socket https://github.com/openresty/lua-nginx-module/blob/75cc29ea64a87bc5cd447525893fda76b8d664b4/t/116-raw-req-socket.t#L784 so now I have fully worked pipeline with your code and that patch – Sergii Getman Jun 04 '20 at 12:53
  • Did you mean that you decode the chunked encoding by yourself, concatenate all decoded chunks into a string and feed this string to `resty.multipart.parser`? BTW, you can use `get_client_body_reader` from [lua-resty-http](https://github.com/ledgetech/lua-resty-http#get_client_body_reader), it supports chunked encoding: https://github.com/ledgetech/lua-resty-http/blob/v0.15/lib/resty/http.lua#L985 – un.def Jun 04 '20 at 14:06

In the previous answer I noted:

We can modify lua-resty-upload to accept a socket-like object instead of hardcoded one and write some sort of buffer that lazily reads bytes from an iterator and provides the socket-like interface.

It's done. I've created a new library named lua-buffet. It can be used to create objects that act like regular ngx_lua cosocket objects. Not all socket methods are implemented yet, but right now it has all the methods required by lua-resty-upload. It is not released yet, but I'm going to release the first version soon.

I also forked and modified lua-resty-upload to add the socket parameter. I'll create the PR to the upstream repository later.

Here is an example of how to handle the data in your case: https://github.com/un-def/lua-buffet/tree/master/examples/resty-chunked-formdata
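Roughly, the pieces fit together like this (a hedged sketch only; the module name, `buffet.new`, and the extra socket argument to `upload:new` are assumptions about the fork, not the actual API – see the linked example for the real code):

```lua
-- Illustrative sketch: names and signatures below are assumptions.
local buffet = require('buffet')          -- assumed module name
local upload = require('resty.upload')    -- the fork that accepts a socket

-- `reader` is any iterator yielding decoded body chunks, e.g. the one
-- returned by lua-resty-http's get_client_body_reader
local sock = buffet.new(reader)                  -- cosocket-like object
local form = upload:new(4096, nil, sock)         -- hypothetical signature

-- from here on, form:read() works exactly as in the first answer,
-- streaming the formdata parts without buffering the whole body
```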

un.def