I have an ASP.NET Core 3.1 web service that runs on Kestrel, and I want to accept uploads of large files without caching them. The service will simply pass the file data on to another server. This service is running on a node with very limited resources, so if I am uploading a file that is hundreds of MB or larger, it is physically impossible to cache it in memory or on disk.
In my testing, ASP.NET Core seems to insist on caching the entire request body before any of it is passed on to my controller. For example, if I write a client that uploads a 100 MB file at 5 MB/sec, the controller's route is only invoked about 20 seconds after the request starts, and only after the requester has closed the request stream.
This occurs whether the request is using SSL or not. I have been using unencrypted requests as a baseline, to reduce the number of moving parts. I've done some reading that suggests that there might be another problem that comes up with the use of SSL, where it wants to buffer the entire request in order to do SSL certificate negotiation. I'll cross that bridge when I come to it; for now, that's definitely not what's going on because I'm testing with an unencrypted HTTP endpoint.
How can I have the controller begin processing the upload immediately, without it having to find a place server-side to store the entire file before any of my code starts to run?
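For reference, here is a minimal sketch of the kind of streaming controller I have in mind (route and names are hypothetical). Reading `Request.Body` directly, rather than binding to a model, should let the action process data as it arrives:

```csharp
using System.Threading.Tasks;
using Microsoft.AspNetCore.Mvc;

[ApiController]
[Route("upload")]
public class UploadController : ControllerBase
{
    // Process the request body as it arrives, without model binding.
    // [DisableRequestSizeLimit] lifts Kestrel's default body-size cap.
    [HttpPut]
    [DisableRequestSizeLimit]
    public async Task<IActionResult> Put()
    {
        var buffer = new byte[81920];
        int read;

        // Request.Body is a forward-only stream; each ReadAsync returns
        // as soon as some data is available, not when the upload completes.
        while ((read = await Request.Body.ReadAsync(buffer, 0, buffer.Length)) > 0)
        {
            // Forward buffer[0..read] to the downstream server here.
        }
        return Ok();
    }
}
```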
UPDATE: Could the caching be taking place on the client side? I found `HttpWebRequest.AllowWriteStreamBuffering`, which defaults to `true`. But even after setting it to `false`, I still see buffering taking place. There doesn't seem to be any buffering if I use e.g. Postman to deliver a very large request. Hmm...
UPDATE 2: .NET Core simply doesn't implement `AllowWriteStreamBuffering`. It always buffers requests. Giant sigh.
UPDATE 3: On .NET Framework, `HttpClient` is implemented on top of `HttpWebRequest`. On .NET Core (and .NET 5/6), `HttpWebRequest` is implemented on top of `HttpClient`. It turns out that this inversion makes it very difficult to support streaming of non-seekable `Stream`s, so the `HttpWebRequest` implementation in .NET Core always buffers. The `HttpClient` implementation, however, does not impose buffering on you. So just use `HttpClient` and things will work properly everywhere.
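A sketch of what that client side can look like with `HttpClient` (the URL is a placeholder): wrapping a non-seekable stream in `StreamContent`, with no `Content-Length` set, sends the body with chunked transfer encoding instead of buffering it:

```csharp
using System.IO;
using System.Net.Http;
using System.Threading.Tasks;

class Uploader
{
    static readonly HttpClient Client = new HttpClient();

    // Streams `source` to the server as the request body. Because the
    // stream is non-seekable and no Content-Length is supplied,
    // HttpClient sends it with Transfer-Encoding: chunked rather than
    // buffering the whole thing first.
    public static async Task UploadAsync(Stream source)
    {
        using var request = new HttpRequestMessage(HttpMethod.Put,
            "http://example.invalid/upload")  // placeholder URL
        {
            Content = new StreamContent(source)
        };
        using var response = await Client.SendAsync(request);
        response.EnsureSuccessStatusCode();
    }
}
```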
I observed the pathological buffering in a web service that receives requests and forwards them on, and I've positively confirmed that the buffering was happening in the forwarding code acting as a client, not in the ASP.NET Core front end. Further, I have managed to work around the problem, though it required a fair bit of code. The code is specific to the way the outbound requests are made in the service: they are uploads to Azure Blob Storage using `BlobClient`.
To avoid caching when you have a non-seekable stream, this is what I came up with:

1. I created a subclass of `Stream` that maintains a ring buffer of the last N bytes at all times and allows rewinding into that buffer, such that reads come out of the buffer until they catch back up with the head. The underlying stream is treated as non-seekable.
2. I used this in the call to `BlobClient.UploadAsync`, along with `BlobUploadOptions` that fully populate `TransferOptions`. The properties in `TransferOptions` allow configuring a "chunk" size (`InitialTransferSize` must be set, and to ensure linear reading, `MaximumConcurrency` must be 1). If an error occurs and `UploadAsync` needs to do a retry, it only needs to rewind to the start of the current chunk, which in turn means the ring buffer only needs to be one chunk in size.
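The upload call from step 2 looks roughly like this (`RewindableRingBufferStream` is my subclass and isn't shown here; the `Azure.Storage.Blobs` types are from the v12 SDK, and the chunk size is just an example):

```csharp
using Azure.Storage;
using Azure.Storage.Blobs;
using Azure.Storage.Blobs.Models;

const long ChunkSize = 4 * 1024 * 1024;

// `inner` is the non-seekable request stream; wrap it so that
// UploadAsync can rewind within the current chunk on a retry.
// Since retries only rewind to the chunk start, the ring buffer
// only needs to hold one chunk.
var rewindable = new RewindableRingBufferStream(inner, bufferSize: ChunkSize);

var options = new BlobUploadOptions
{
    TransferOptions = new StorageTransferOptions
    {
        InitialTransferSize = ChunkSize, // chunk size; must be set
        MaximumConcurrency = 1,          // ensure strictly linear reads
    }
};

await blobClient.UploadAsync(rewindable, options);
```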
With this extensive workaround, I can now have an ASP.NET service that receives a file via HTTP PUT or POST and streams it directly into Azure Blob Storage without any intermediate buffering, allowing this service to be run on a resource-constrained node.
I am voting to close this question now, as it is no longer relevant.
UPDATE 4: `BlobClient.UploadAsync` will automatically chop the stream up into independently buffered chunks if you use a `BlobUploadOptions` with a `TransferOptions` that has both `InitialTransferSize` and `MaximumTransferSize` populated. You can use `MaximumConcurrency` as well, and your memory usage will be `MaximumConcurrency * MaximumTransferSize`, across `MaximumConcurrency` separate buffers. This is way simpler than what I had been doing. You just need to assign both of those options; otherwise it will always buffer the entire stream before it starts the request (if `CanSeek` is `false`).
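So the simpler configuration just lets the SDK do its own per-chunk buffering (the sizes and concurrency here are only examples):

```csharp
using Azure.Storage;
using Azure.Storage.Blobs.Models;

var options = new BlobUploadOptions
{
    TransferOptions = new StorageTransferOptions
    {
        InitialTransferSize = 4 * 1024 * 1024, // first chunk's size
        MaximumTransferSize = 4 * 1024 * 1024, // size of each subsequent chunk
        MaximumConcurrency = 4,                // peak memory ~= 4 * 4 MB
    }
};

// With both sizes set, UploadAsync chunks the non-seekable stream
// itself instead of buffering it in its entirety first.
await blobClient.UploadAsync(nonSeekableStream, options);
```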