3

I need to save data into an Excel file(.xlsx). I decided to use "ClosedXML" NuGet (v0.95.3) (https://www.nuget.org/packages/ClosedXML/) for implementing it, after getting several recommendations on this NuGet from different developers.

Recently I encounter a problem when exporting 50k rows:

  • The saving process alone (after all the data was already added to the file) taking a very long time around ~10 seconds
  • This is without mention that the styling of these rows: font, borders, adjust columns width etc. is taking around ~12 seconds.
  • This without mentioning the fact which its taking me an around ~20 to retrieve and write the data into the file, so all the process in such case taking me ~45 seconds Total (too way much time!!).

I'm saving the data into a given stream using the "SaveAs()" method, and I already used the "XLEventTracking.Disabled" optimization.

Probably, I'm not the first one which dealing with Excel files, so:

  1. Is anyone of you is familiar with "ClosedXML" NuGet and encounter such problem in the past?
  2. Are you using a different NuGet for Excel files? (even if it cost money).

Thanks in advance!

Nuriel Zrubavely
  • 463
  • 3
  • 13
  • 1
    A xlsx file is just compressed xml. You can create a two rows Excel sheet, decompress it, look at the xml and simply generate a text file with the 50,000 rows and compress it. – Tarik Sep 04 '20 at 11:20
  • 2
    ClosedXML keeps an in-memory model of all cells and writes it out to file only when you save. The benefit is that this enables a very useful API to manipulate your spreadsheet, but the disadvantage is the memory footprint. If you just want to dump data into a file, I'd look at other tools that write to file. (I'm a ClosedXML maintainer). – Francois Botha Sep 09 '20 at 08:46

1 Answers1

3

I have used ClosedXML for a few years, and I have to admit that the library - as of version 0.95.1 - is pretty much unusable for larger Excel reports. The memory usage/footprint is a disaster. A really large report can easily allocate several gigabytes of RAM.

The performance problems you have seen are linked to GC (garbage collection). Looking at the code, you quickly realize that fixing this will require several iterations of improvements.

I would recommend looking at other libraries. Personally, I prefer to use the native libxlsxwriter library. You can easily integrate with it, using DllImport interop. For large reports, consider the Constant Memory mode. With constant memory, it outperforms ClosedXML in terms of both CPU and RAM.

l33t
  • 18,692
  • 16
  • 103
  • 180
  • ClosedXML maintainer here. I'm actually surprised that you say the memory usage as regressed. We've been working on it and round about 0.93 there was a major improvement. If you have a specific use case where it regressed in 0.95.1, please log it on the repo. – Francois Botha Sep 09 '20 at 08:47
  • For sure, 0.95.1 was an improvement in many ways. But not enough. Just create a sheet with 100 columns and 100,000+ rows. In my tests, we easily hit 6+ GB RAM usage. With `libxlsxwriter` we kept the RAM usage at 20 MB or so. – l33t Sep 09 '20 at 11:27
  • I'm having the same issue with 0.96. Building a spreadsheet with around 100k rows understandably takes a significant time, but it's the SaveAs call that's the real killer and takes up the majority of both time and system resources. – Petter Brodin Nov 03 '22 at 13:17