
I'm writing a Node.js PUT endpoint to allow users to upload large files. As I test the server via a cURL command, I'm finding that the entire file is 'uploaded' before my Node.js request handler fires:

cURL command

cat ./resource.tif \
  | curl \
    --progress-bar \
    -X PUT \
    --data-binary @- \
    -H "Content-Type: application/octet-stream" \
    https://server.com/path/to/uploaded/resource.tif \
      | cat

While testing, I know that https://server.com/path/to/uploaded/resource.tif already exists on the server. In my Node.js code I test for this and respond with 409:

if (exists) {
  const msg = 'Conflict. Upload path already exists'
  res.writeHead(409, msg)
  res.write(msg)
  res.end()
  return
}

I'm finding that the response is only sent after the entire file has been uploaded. But I'm not sure if the file is buffering on the client side (i.e. cURL), or on the server side.

In any case... How do I configure cURL to pass the file stream to Node.js without buffering?

Other questions/answers that I have seen, for example this one (use pipe for curl data), take the same approach of piping the output of cat (or something similar) into the --data-binary argument. But this still results in the whole file being processed before I see the conflict error.

Using mbuffer, as mentioned in https://stackoverflow.com/a/48351812/3114742:

mbuffer \
  -i ./myfile.tif \
  -r 2M \
    | curl \
      --progress-bar \
      --verbose \
      -X PUT \
      --data-binary @- \
      -H "Content-Type: application/octet-stream" \
      http://server.com/path/to/myfile.tif \
        | cat

This clearly shows that cURL only completes the request once the entire file contents have been read on the local machine.

Zach Smith
  • I did a quick test uploading a file to a plain Node.js `http` server that closes the connection early, and cURL tells me that it got an _"HTTP error before end of send, stop sending"_. Also, it's the server's task to handle connections, so _it_ will have to decide if it accepts the full upload before generating the response. – robertklep Jul 22 '22 at 05:57

1 Answer


curl will exit when it receives the 409 response and the response has ended, at least in my testing.

What allows curl to start the upload is that the request includes the header `Expect: 100-continue`, which causes Node's http(s) server to use the default checkContinue handler. That handler responds to the client with `HTTP/1.1 100 Continue`, and curl proceeds with the upload.

To stop a client from starting the upload, handle requests that include `Expect: 100-continue` yourself via the checkContinue event:

server.on('checkContinue', (req, res) => {
  console.log('checkContinue', req.method, req.url)
  res.writeHead(409, {'Content-Type':'text/plain'})
  res.end('Nope')
})

nginx

The flow you want from nginx can be achieved with `proxy_request_buffering off;`:

1   client > proxy  : PUT /blah
2a  proxy  > client : 100 continue
2b  proxy  > app    : PUT /blah

3a  client > proxy  : start PUT chunks
3b  app    > proxy  : 409/close

4  proxy  > client : 409/close  
5  client bails with error

The 409/close to the client should only be in the milliseconds range behind the 100 Continue in normal operation (or whatever the normal latency of this app's responses is).
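As a sketch, the relevant nginx location could look something like this. The path, upstream address, and size limit are placeholders, not taken from the question's actual config:

```nginx
location /path/to/ {
    # Stream the request body to the app as it arrives, so the app's
    # early 409 can reach the client before the upload completes.
    proxy_request_buffering off;

    # Don't buffer the app's response on the way back either.
    proxy_buffering off;

    # HTTP/1.1 upstream so the unbuffered body can be sent chunked.
    proxy_http_version 1.1;
    proxy_pass http://127.0.0.1:3000;

    # Still enforced up front: nginx rejects bodies larger than this.
    client_max_body_size 8G;
}
```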

The flow nginx provides with request buffering enabled (the default) is:

1  client > proxy  : PUT /blah
2  proxy  > client : 100 continue
3  client > proxy  : PUT _all_ chunks
4  proxy  > app    : PUT /blah with all chunks
5  app    > proxy  : 409/close
6  proxy  > client : 409/close  
7  client completes with error
Matt
  • This does solve the problem on localhost. I can see that the request stops. However when deployed, curl still uploads the whole file before getting 409. Is there anything that I should do for Nginx proxy settings? – Zach Smith Jul 22 '22 at 07:56
  • Interesting. It looks like that is spec behaviour for a http proxy to handle the `100` while waiting for a response, so might be hard to disable. – Matt Jul 23 '22 at 02:10
  • I imagine you are only looking at the amount of data that can be transferred between nginx accepting the connection and getting the 409 response from the app server, which is not going to be huge amounts. That is probably linked to the nginx buffer sizes. – Matt Jul 23 '22 at 02:10
  • Unless you have something further in the app delaying the regular 409 response until the upload is completed, I was thinking some connect middleware handling the upload first before calling `next()` originally – Matt Jul 23 '22 at 02:12
  • Thanks @Matt, is it possible that cURL starts uploading as it's waiting for the Expect response, and then just continues streaming the file once it's started? The cURL command only exits once the full file is processed, which in this case is 2.4GB. – Zach Smith Jul 23 '22 at 04:54
  • I've set the Nginx proxy `client_max_body_size` value to `8G`. This doesn't have anything to do with caching, does it? – Zach Smith Jul 23 '22 at 04:55
  • As far as I know, I'm only handling the res body after checking that the request is both to a non-existent resource, and is authenticated. https://github.com/SAEON/mnemosyne/blob/stable/src/server/routes/put/index.js#L29-L79 – Zach Smith Jul 23 '22 at 04:58
  • The app looks fine. I think you need to disable the buffering in nginx if you want the app to respond, otherwise nginx slurps it all up. `proxy_request_buffering off;` does that, but not sure if that's a useful setting for uploads generally. – Matt Jul 23 '22 at 10:26
  • Thank you - seems like that's the answer. This is useful https://serverfault.com/questions/741610/what-is-the-difference-between-proxy-request-buffering-and-proxy-buffering-on-ng – Zach Smith Jul 24 '22 at 07:14