I have a .NET application written in C# (.NET 4.0). In this application, we have to read a large dataset from a file and display the contents in a grid-like structure, so I placed a DataGridView on the form. It has 3 columns, and all column data comes from the file. Initially, the file had about 600,000 records, corresponding to 600,000 rows in the DataGridView.
I quickly found out that the DataGridView collapses under such a large data set, so I had to switch to Virtual Mode. To accomplish this, I first read the file completely into 3 different arrays (corresponding to the 3 columns), and then, as the CellValueNeeded events fire, I supply the correct values from the arrays.
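To show what I mean, here is a stripped-down sketch of that virtual-mode wiring; the three string arrays and the form plumbing are simplified stand-ins for our real code:

```csharp
using System;
using System.Windows.Forms;

public class GridForm : Form
{
    private readonly DataGridView grid = new DataGridView();
    private readonly string[] col1, col2, col3;   // one array per column, filled from the file

    public GridForm(string[] c1, string[] c2, string[] c3)
    {
        col1 = c1; col2 = c2; col3 = c3;

        grid.Dock = DockStyle.Fill;
        grid.VirtualMode = true;                  // the grid no longer stores cell values itself
        grid.ColumnCount = 3;
        grid.CellValueNeeded += OnCellValueNeeded;
        grid.RowCount = col1.Length;              // one grid row per record
        Controls.Add(grid);
    }

    // Called by the grid whenever a visible cell needs to be painted.
    private void OnCellValueNeeded(object sender, DataGridViewCellValueEventArgs e)
    {
        switch (e.ColumnIndex)
        {
            case 0: e.Value = col1[e.RowIndex]; break;
            case 1: e.Value = col2[e.RowIndex]; break;
            case 2: e.Value = col3[e.RowIndex]; break;
        }
    }
}
```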
However, as we quickly found out, there can be a huge (HUGE!) number of records in this file. When the record count is very large, reading all the data into arrays, a List<>, etc. is simply not feasible: we run into memory allocation errors (OutOfMemoryException).
We got stuck there, but then realized: why read all the data into arrays first? Why not read the file on demand as the CellValueNeeded event fires? So that's what we do now: we open the file but do not read anything up front, and as CellValueNeeded events fire, we first Seek() to the correct position in the file and then read the corresponding data.
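Roughly, the handler now looks like the sketch below. It assumes fixed-length records (that is how we can Seek() straight to a row); the record and field sizes are placeholders for our actual layout, and for variable-length lines an offset index built in one initial pass would be needed instead:

```csharp
using System;
using System.IO;
using System.Text;
using System.Windows.Forms;

public class OnDemandSource : IDisposable
{
    private const int RecordSize = 48;            // assumed: 3 fixed-width fields per record
    private const int FieldSize = RecordSize / 3;
    private readonly FileStream stream;
    private readonly byte[] buffer = new byte[RecordSize];

    public OnDemandSource(string path)
    {
        // Open once and keep the handle for the lifetime of the grid.
        stream = new FileStream(path, FileMode.Open, FileAccess.Read, FileShare.Read);
    }

    public long RowCount { get { return stream.Length / RecordSize; } }

    // Wire this method to DataGridView.CellValueNeeded.
    public void CellValueNeeded(object sender, DataGridViewCellValueEventArgs e)
    {
        // Jump to the record, read it, and slice out the requested field.
        stream.Seek((long)e.RowIndex * RecordSize, SeekOrigin.Begin);
        stream.Read(buffer, 0, RecordSize);       // a full read is assumed in this sketch
        e.Value = Encoding.ASCII.GetString(buffer, e.ColumnIndex * FieldSize, FieldSize).TrimEnd();
    }

    public void Dispose() { stream.Dispose(); }
}
```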
This is the best we could come up with, but, first of all, it is quite slow, which makes the application sluggish and not user friendly. Second, we can't help but think that there must be a better way to accomplish this. For example, some binary editors (like HXD) are blindingly fast for any file size, so I'd like to know how this can be achieved.
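Our guess is that such editors never hold the whole file in memory; they read it in large pages and keep the most recently used page cached, so a repaint touches the disk at most once per page rather than once per cell. Here is a minimal sketch of that idea; the page size and the single-page cache are our assumptions, not HXD's actual design:

```csharp
using System;
using System.IO;

public class PagedReader : IDisposable
{
    private const int PageSize = 64 * 1024;       // assumed page size
    private readonly FileStream stream;
    private readonly byte[] page = new byte[PageSize];
    private long pageStart = -1;                  // file offset of the cached page
    private int pageLength;

    public PagedReader(string path)
    {
        stream = new FileStream(path, FileMode.Open, FileAccess.Read, FileShare.Read);
    }

    public byte ReadByte(long offset)
    {
        long start = (offset / PageSize) * PageSize;
        if (start != pageStart)                   // cache miss: load the whole page
        {
            stream.Seek(start, SeekOrigin.Begin);
            pageLength = stream.Read(page, 0, PageSize);
            pageStart = start;
        }
        int index = (int)(offset - pageStart);
        if (index >= pageLength) throw new ArgumentOutOfRangeException("offset");
        return page[index];
    }

    public void Dispose() { stream.Dispose(); }
}
```

Consecutive reads within the same 64 KB page then cost one array lookup each, which would explain the smooth scrolling, but we don't know whether this is what editors like HXD actually do.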
Oh, and to add to our problems: in the DataGridView's virtual mode, when we set RowCount to the number of rows available in the file (say 16,000,000), it takes a while for the DataGridView even to initialize itself. Any comments on this 'problem' would be appreciated as well.
Thanks