
I have an ASP.NET Core app. There is a report, generated from data in the DB, that can potentially be several gigabytes in size.

Is it possible to stream the data as a Web API response in chunks, so that memory usage on the server is minimized?

A very rough idea of what I want is

[HttpGet]
public IActionResult HugeReport()
{
    Response.ContentType = "text/plain";
    var sw = new StreamWriter(Response.Body);     // the stream I'd like to write to, chunk by chunk
    var doc = new SpreadsheetDocument();          // pseudocode: some incrementally writable document
    for (int i = 0; i < 100; i++)
    {
        var data = Db.Skip(i * 1000).Take(1000);  // fetch one batch from the DB
        doc.WorkbookPart.AddData(data);           // pseudocode: append the batch, then free its memory
    }
    return new EmptyResult();                     // the body has already been streamed at this point
}

Or, if this approach is not possible, can I construct a file in the file system in chunks, iteration by iteration (without loading the whole document into memory)?
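
For comparison, this is roughly how I would do it for a plain-text/CSV report, writing and flushing one batch at a time (the `Db` query and the row formatting are placeholders). The question is whether the same pattern can work for a spreadsheet document:

[HttpGet]
public async Task HugeReportCsv()
{
    Response.ContentType = "text/csv";
    await using var writer = new StreamWriter(Response.Body);

    for (int offset = 0; ; offset += 1000)
    {
        var batch = Db.Skip(offset).Take(1000).ToList();   // one batch at a time
        if (batch.Count == 0)
            break;

        foreach (var row in batch)
            await writer.WriteLineAsync(row.ToString());   // format the row however the report needs

        // Push this chunk to the client so it never accumulates on the server.
        await writer.FlushAsync();
    }
}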

Anarion
  • I still don't understand your requirement clearly. What do you mean by constructing the file in chunks? Do you mean you want to build the XML document chunk by chunk in memory? That would be no different from loading the whole document into memory. – Brando Zhang Nov 19 '21 at 06:56
  • @BrandoZhang, no, I mean the way you can stream an infinite string without exhausting memory - is it possible to stream an XML document in the same manner (or write it to disk)? I.e. get a set of data -> append it to the document -> free the memory by streaming it to the client (or writing it to disk). – Anarion Nov 19 '21 at 07:37
  • See https://www.tpeczek.com/2021/07/aspnet-core-6-and-iasyncenumerable.html (.NET 6 only) – haim770 Nov 25 '21 at 08:14 (a sketch of this approach follows the comments)
  • OpenXML files aren't simply XML files; they're ZIP files which contain multiple XML files in a specific structure. With the right zip library, you *may* be able to stream that ZIP file without storing everything in memory, at the expense of disabling compression. But AFAIK, the OpenXML SDK doesn't provide support for that, so you'd need to construct the document manually. – Richard Deeming Nov 25 '21 at 09:55 (a write-to-disk sketch using `OpenXmlWriter` follows the comments)
  • *NB:* If you're going to open the file in Excel, be aware of the [Excel file limits](https://support.microsoft.com/en-us/office/excel-specifications-and-limits-1672b34d-7043-467e-8e27-269d656771c3) - you can't have more than `1_048_576` rows and `16_384` columns. – Richard Deeming Nov 25 '21 at 09:58
  • It depends on whether your file is updated on a regular basis. If it is, I'm not sure what you want to do is the best solution. Otherwise you can still use a [FileStream](https://learn.microsoft.com/en-us/dotnet/api/system.io.filestream?view=net-6.0), sending the client some amount of data and using pagination: the client calls the API providing the total amount of data it has already received (i.e. the stream position in the file), and client and server repeat this until the end of the stream (a sketch of this follows the comments). If the file has been updated in between, the client's file will certainly be corrupted. – Hazrelle Nov 25 '21 at 18:33
  • Who is a consumer of your API? @Anarion – Andriy Shevchenko Nov 27 '21 at 15:13
  • @AndriyShevchenko - browser user – Anarion Nov 27 '21 at 15:51
  • Maybe you could paginate the report instead? – Ryan Wildry Dec 01 '21 at 19:03
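
A rough sketch of the `IAsyncEnumerable` approach from haim770's link, assuming a hypothetical `ReportRow` type and a `Db.GetBatchAsync(offset, count)` data-access method. In ASP.NET Core 6, rows returned this way are serialized into the JSON response as they are yielded instead of being buffered first:

[HttpGet("huge-report")]
public async IAsyncEnumerable<ReportRow> HugeReport()
{
    const int batchSize = 1000;

    for (int offset = 0; ; offset += batchSize)
    {
        // Only the current batch is held in memory.
        var batch = await Db.GetBatchAsync(offset, batchSize);
        if (batch.Count == 0)
            yield break;

        foreach (var row in batch)
            yield return row;
    }
}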
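
On Richard Deeming's point: while the OpenXML SDK can't stream the ZIP package straight into the response, it does offer a forward-only `OpenXmlWriter`, which covers the "construct a file in the file system in chunks" part of the question. A minimal sketch, again with a placeholder `Db` query and `ToString()` standing in for real cell values:

using DocumentFormat.OpenXml;
using DocumentFormat.OpenXml.Packaging;
using DocumentFormat.OpenXml.Spreadsheet;

public static void WriteHugeReport(string path)
{
    using var doc = SpreadsheetDocument.Create(path, SpreadsheetDocumentType.Workbook);
    var workbookPart = doc.AddWorkbookPart();
    var worksheetPart = workbookPart.AddNewPart<WorksheetPart>();

    // Forward-only writer: rows are written out as they are produced,
    // so the whole sheet never has to exist in memory at once.
    using (var writer = OpenXmlWriter.Create(worksheetPart))
    {
        writer.WriteStartElement(new Worksheet());
        writer.WriteStartElement(new SheetData());

        for (int offset = 0; ; offset += 1000)
        {
            var batch = Db.Skip(offset).Take(1000).ToList();
            if (batch.Count == 0)
                break;

            foreach (var item in batch)
            {
                var cell = new Cell { DataType = CellValues.InlineString };
                cell.AppendChild(new InlineString(new Text(item.ToString())));
                writer.WriteElement(new Row(cell));
            }
        }

        writer.WriteEndElement(); // SheetData
        writer.WriteEndElement(); // Worksheet
    }

    // Register the worksheet in the workbook.
    workbookPart.Workbook = new Workbook(
        new Sheets(
            new Sheet
            {
                Name = "Report",
                SheetId = 1U,
                Id = workbookPart.GetIdOfPart(worksheetPart)
            }));
    workbookPart.Workbook.Save();
}

The finished .xlsx can then be sent to the browser with `PhysicalFile(path, contentType, enableRangeProcessing: true)` and cleaned up afterwards.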
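
And a minimal sketch of Hazrelle's pagination idea, once the report file already exists on disk: the client passes back how many bytes it has received so far, and the server returns the next slice (the path and chunk size are made up for the example):

[HttpGet("huge-report-chunk")]
public async Task<IActionResult> HugeReportChunk([FromQuery] long position = 0)
{
    const string path = "huge-report.xlsx";   // wherever the pre-built report lives
    const int chunkSize = 1024 * 1024;        // 1 MB per request

    await using var file = new FileStream(path, FileMode.Open, FileAccess.Read, FileShare.Read);
    if (position >= file.Length)
        return NoContent();                   // the client already has everything

    file.Seek(position, SeekOrigin.Begin);
    var buffer = new byte[(int)Math.Min(chunkSize, file.Length - position)];
    var read = await file.ReadAsync(buffer);

    return File(buffer[..read], "application/octet-stream");
}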

0 Answers