
I have a C# program that is pulling in a .csv file that is approx. 42,000 lines long. All data in the file is stored as follows:

Zipcode,City,State

I am pulling all the information into three different columns in a listview.

Currently this data takes about 30 - 50 seconds to be brought into my program. My question is how can I better optimize my code to get this time down?

The following is a snippet of my code. The commented-out code is an earlier attempt that was no faster, so I rewrote it in a way that is easier to read.

        //These are globally declared.
        lvZip.Columns.Add("Zipcode", 150, HorizontalAlignment.Left);
        lvZip.Columns.Add("City", 150, HorizontalAlignment.Left);
        lvZip.Columns.Add("State", 150, HorizontalAlignment.Left);
        lvZip.View = View.Details;

        lvZip.Items.Clear();

        //string dir = System.IO.Path.GetDirectoryName(
        //  System.Reflection.Assembly.GetExecutingAssembly().Location);

        //string path = dir + @"\zip_code_database_edited.csv";
        //var open = new StreamReader(File.OpenRead(path));

        //foreach (String s in File.ReadAllLines(path))
        //{
        //    Zipinfo = s.Split(',');
        //    Zipinfo[0] = Zipinfo[0].Trim();
        //    Zipinfo[1] = Zipinfo[1].Trim();
        //    Zipinfo[2] = Zipinfo[2].Trim();
        //    lvItem = new ListViewItem(Zipinfo);
        //    lvZip.Items.Add(lvItem);
        //}
        //open.Close();

        StreamReader myreader = File.OpenText(path);
        aLine = myreader.ReadLine();

        while (aLine != null)
        {
            Zipinfo = aLine.Split(',');
            Zipinfo[0] = Zipinfo[0].Trim();
            Zipinfo[1] = Zipinfo[1].Trim();
            Zipinfo[2] = Zipinfo[2].Trim();
            lvItem = new ListViewItem(Zipinfo);
            lvZip.Items.Add(lvItem);
            aLine = myreader.ReadLine();
        }
        myreader.Close();
Mark P.
  • The 3 answers so far are good. You might want to try the ListView property DoubleBuffered, though it probably won't make a difference if already using BeginUpdate+EndUpdate. You could make the ListView _very_ fast (e.g. under 1 sec to load only 1 'page' of data) with VirtualMode if you think this approach is worth the extra effort. – groverboy Dec 05 '13 at 02:14
  • Using `BeginUpdate()` and `EndUpdate()` will give the most *visible* performance boost, but the `AddRange()` approach suggested by Tweety will optimize your code further. The boost from it may be less noticeable, as with `AddRange()`, but the `TextFieldParser` class was made specifically to handle structured text files like CSV. – Derek W Dec 05 '13 at 03:03

4 Answers

First, call ListView.BeginUpdate() before you add anything to the ListView and ListView.EndUpdate() after you are done. Second, use ListView.Items.AddRange() instead of ListView.Items.Add(). Each call to Add() makes the ListView redraw itself, whereas AddRange() redraws it only once for the whole batch. That should optimize it a little for you.
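
A minimal sketch of how the two suggestions might fit together, assuming the same "Zipcode,City,State" file layout as in the question. The `ZipListLoader` class and its `Load` method are hypothetical names used only for illustration; the ListView calls themselves are the standard WinForms ones.

```csharp
using System.Collections.Generic;
using System.IO;
using System.Windows.Forms;

static class ZipListLoader
{
    // Hypothetical helper: fills listView from a "Zipcode,City,State" file.
    public static void Load(ListView listView, string path)
    {
        // Build every item in memory first, without touching the control.
        var items = new List<ListViewItem>();
        foreach (string line in File.ReadLines(path))
        {
            if (line.Length == 0) continue; // skip blank lines

            string[] fields = line.Split(',');
            for (int i = 0; i < fields.Length; i++)
                fields[i] = fields[i].Trim();

            items.Add(new ListViewItem(fields));
        }

        // Suspend drawing, add the whole batch in one call, then resume.
        listView.BeginUpdate();
        try
        {
            listView.Items.Clear();
            listView.Items.AddRange(items.ToArray());
        }
        finally
        {
            listView.EndUpdate(); // runs even if AddRange throws
        }
    }
}
```

With the question's control this would be called as `ZipListLoader.Load(lvZip, path);` after the columns have been added.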

Tweety

You can try:

lvZip.BeginUpdate();

before you start adding all the items.

Then:

lvZip.EndUpdate();

when you've finished.

This stops the control from redrawing after every item you add, and that redrawing is what makes the whole process so slow.

Baldrick
  • This is why I love stackoverflow. Clear concise answers. It helped. Loading took about 5 seconds. Any other suggestions? And Thanks! – Mark P. Dec 05 '13 at 00:54
  • Making it much faster could be tricky. Here's a similar question with other suggestions and timings: http://stackoverflow.com/questions/9008310/how-to-speed-adding-items-to-a-listview – Baldrick Dec 05 '13 at 00:55
  • Thanks. I didn't mean to dupe an already posted question. I did some research and couldn't find that particular one! – Mark P. Dec 05 '13 at 00:56
  • No worries. I think 5 secs doesn't sound too bad for your situation. – Baldrick Dec 05 '13 at 01:16

Golden Rule: Don't use String.Split() to read CSV data.

The .NET Framework already has a built-in dedicated CSV parser called TextFieldParser.

It's located in the Microsoft.VisualBasic.FileIO namespace.

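There are many edge cases that String.Split() is not equipped to handle, such as quoted fields that contain commas. Whether the parser is also faster than StreamReader combined with Split() is something I would measure rather than assume.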

One last tip: wrap your disposable objects in using statements so that they are always disposed and their unmanaged resources are released. I see that you are not using them (pun not intended) in the above code.

It's actually not that far outside the scope of this question since efficient memory management can boost the performance of your code.
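
As a rough sketch of both points, reading the file with TextFieldParser inside a using statement could look something like this. The `ZipCsvReader` class and `ReadRows` method are hypothetical names for illustration, and the project needs a reference to the Microsoft.VisualBasic assembly.

```csharp
using System.Collections.Generic;
using Microsoft.VisualBasic.FileIO; // requires a reference to Microsoft.VisualBasic

static class ZipCsvReader
{
    // Hypothetical helper: returns one string[] per data line in the file.
    public static List<string[]> ReadRows(string path)
    {
        var rows = new List<string[]>();

        // The using statement disposes the parser (and the underlying file
        // handle) even if a malformed line makes ReadFields throw.
        using (var parser = new TextFieldParser(path))
        {
            parser.TextFieldType = FieldType.Delimited;
            parser.SetDelimiters(",");
            parser.HasFieldsEnclosedInQuotes = true; // "Washington, D.C." stays one field
            parser.TrimWhiteSpace = true;            // replaces the manual Trim() calls

            while (!parser.EndOfData)
            {
                rows.Add(parser.ReadFields());
            }
        }

        return rows;
    }
}
```

Each returned array can be passed straight to `new ListViewItem(fields)`, as in the question's loop.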

Derek W
  • What is a helpful class like TextFieldParser doing in a VB namespace? I'm new to C#, so I had never heard of it, but from my research it looks easy enough. Thanks for the tip! – Mark P. Dec 05 '13 at 15:45

A little more work perhaps, but by using a DataGridView with the text file as its data source, you can load a 42,000-line .csv in under 2 seconds. Here's some code to look at:

    private void button2_Click(object sender, EventArgs e)
    {
        string errorInfo = String.Empty;
        //open text file into Dataset:
        string textFilePath = @"textfile1.csv";

        DataSet dataTextFile = new DataSet("textfile");
        if(!LoadTextFile(textFilePath, dataTextFile, out errorInfo))
        {
            MessageBox.Show("Failed to load text file:\n" + errorInfo,
                "Load Text File");
            return;
        }
        dgTextFile.DataSource = dataTextFile.Tables[0];
        dataTextFile.Dispose(); 
    }

    private bool LoadTextFile(string textFilePath, DataSet dataToLoad, out string errorInfo)
    {
        errorInfo = String.Empty;

        try
        {
            // Requires: using System.Data; using System.Data.OleDb;
            // The Jet text driver treats the folder as the database and each file as a table.
            var fileInfo = new System.IO.FileInfo(textFilePath);
            string textConnectionString = @"Provider=Microsoft.Jet.OLEDB.4.0;" +
                                            "Data Source=" + fileInfo.DirectoryName + ";" +
                                            "Extended Properties=\"text;\";";
            string selectCommand = "select * from " + fileInfo.Name;

            // The using blocks close and dispose everything even if Fill throws.
            using (var textConnection = new OleDbConnection(textConnectionString))
            using (var textOpenCommand = new OleDbCommand(selectCommand, textConnection))
            using (var textDataAdapter = new OleDbDataAdapter(textOpenCommand))
            {
                textConnection.Open();
                textDataAdapter.Fill(dataToLoad);
            }


            return true;
        }
        catch(Exception ex_load_text_file)
        {
            errorInfo = ex_load_text_file.Message;
            return false;
        }
    }

Some of this code is from an MSDN sample but I can't seem to find the page.

tinstaafl