
I am writing a converter from MDB records to SQL insert statements, and it works fine for me, but when there are many more records it becomes very slow. Is there a better way to optimize the parsing and extracting of the records?

using System;
using System.Collections;
using System.Collections.Generic;
using System.IO;
using System.Text;
using System.Data;
using System.Data.OleDb;    

namespace conMDBToSQL
{
    class Program
    {
        static void Main()
        {

            // string sFileName = "C:\\sqlrecords.mdb";
            string sFileName = "C:\\sample.mdb";
            OleDbConnection _con = new OleDbConnection( @"Provider=Microsoft.Jet.OLEDB.4.0;Data Source=" + sFileName + ";Mode=Share Exclusive;User Id=admin;Password=;" );
            try
            {
                _con.Open();

                string sTableName = "users";

                string strSQL = String.Empty;

                strSQL = "SELECT * FROM " + sTableName + " ORDER BY 1";
                OleDbCommand _cmd = new OleDbCommand( strSQL, _con );
                _cmd.CommandType = CommandType.Text;
                OleDbDataReader _dr = _cmd.ExecuteReader( CommandBehavior.SchemaOnly );

                ArrayList _Arr = new ArrayList();


                DataTable dt = _dr.GetSchemaTable();
                foreach ( DataRow drow in dt.Rows )
                {
                    // Console.Write( drow[0].ToString() + "\n" );
                    _Arr.Add( drow[0].ToString() );
                }
                _dr.Close();
                _cmd.Dispose();


                strSQL = "SELECT * FROM " + sTableName + " ORDER BY 1";
                _cmd = new OleDbCommand(strSQL, _con);
                _cmd.CommandType = CommandType.Text;
                _dr = _cmd.ExecuteReader();

                string s_fields = String.Empty;
                string s_values = String.Empty;
                int arr_count = _Arr.Count - 1;
                while (_dr.Read()) 
                {
                    for(int i_a = 0; i_a <= arr_count;i_a++)
                    {
                        if (i_a <= arr_count)
                        {
                            s_fields += _Arr[i_a].ToString() + ",";
                            s_values += "'" + _dr[i_a].ToString() + "',";
                        }
                        else
                        {
                            s_fields += _Arr[i_a].ToString();
                            s_values += "'" + _dr[i_a].ToString() + "'";
                        }
                    }

                    // Build, query
                    strSQL = String.Empty;
                    strSQL = "INSERT INTO " + sTableName + "("+ s_fields +") VALUES ("+ s_values +")";

                    // Write, text file
                    StreamWriter sw = new StreamWriter(sFileName, true, Encoding.UTF8);
                    sw.Write(strSQL);
                    sw.Close();
                    sw = null;
                }


                Console.WriteLine("\n");
                Console.WriteLine("Finished..");
                Console.ReadKey();
            }
            catch ( OleDbException olex )
            {
                throw olex;
            }
            finally 
            {
                _con.Close();
            }

        }
    }
}

Thanks,

Nikhil Agrawal
  • maybe better suited to http://codereview.stackexchange.com? – Liam McInroy May 16 '12 at 01:39
  • Seems like the best way to accomplish this task is really not to write any code at all. There are tools built into Access to import this data into SQL Server without a custom code solution: http://stackoverflow.com/questions/370619/how-to-import-mdb-to-sql-server – jdmcnair May 16 '12 at 02:06
  • Nikhil, what is your scenario? We might be able to suggest something even more powerful and faster if we know what your requirements are. – Mohammed Ali May 16 '12 at 02:26
  • use for instead of foreach, and don't call ToString() (_Arr.Add( drow[0].ToString() );) when using ArrayList; you can put the object in there directly. And yes, use StringBuilder, it is much more efficient. – Eugene Petrov May 16 '12 at 02:39
  • I don't understand the condition if (i_a <= arr_count) inside the for... you are already using that condition in the for itself. Did you mean to use != instead? If so, I suggest using i_a < arr_count and handling the last member outside the loop. And of course, don't write to the file inside the loop; move it out. In addition, you can check the number of records and, if there are a huge number of them, use a Parallel.For during reading instead. – Eugene Petrov May 16 '12 at 02:47
  • @EugenePetrov: The condition is there to prevent adding a comma after the last item. Also, the place where he uses `foreach` is unlikely to be his bottleneck. Changing that code would make almost no difference in the run time. – Jim Mischel May 16 '12 at 02:59
  • @JimMischel : He is using the same condition as in the for, so there is no prevention here: when i_a reaches the last element, the check i_a <= arr_count (arr_count is the last index, so he would have to use <) is still true and it goes into the first branch; the if always does the same thing. Am I not right? I suggested removing the condition from the loop; that makes it faster. – Eugene Petrov May 16 '12 at 05:10
  • *he must use < (I was talking about the if statement) – Eugene Petrov May 16 '12 at 05:17

3 Answers


The SqlBulkCopy class helped me once with a similar task.
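
For illustration, a minimal sketch of that approach, assuming a SQL Server destination table named users with the same columns as the Access table; both connection strings are placeholders you would adjust for your environment:

using System;
using System.Data.OleDb;
using System.Data.SqlClient;

class BulkCopyExample
{
    static void Main()
    {
        // Placeholder connection strings -- adjust for your environment.
        string mdbConn = @"Provider=Microsoft.Jet.OLEDB.4.0;Data Source=C:\sample.mdb;";
        string sqlConn = @"Data Source=.;Initial Catalog=MyDb;Integrated Security=True;";

        using (OleDbConnection src = new OleDbConnection(mdbConn))
        using (SqlConnection dst = new SqlConnection(sqlConn))
        {
            src.Open();
            dst.Open();

            OleDbCommand cmd = new OleDbCommand("SELECT * FROM users", src);
            using (OleDbDataReader reader = cmd.ExecuteReader())
            using (SqlBulkCopy bulk = new SqlBulkCopy(dst))
            {
                // Assumes the destination table already exists with matching columns.
                bulk.DestinationTableName = "users";
                bulk.BatchSize = 1000;      // rows sent per round trip
                bulk.WriteToServer(reader); // streams rows; no SQL text is built
            }
        }
    }
}

Because the reader is streamed straight into the destination table, no INSERT strings are built at all, which is usually far faster for large row counts.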

Val Bakhtin

I think your major culprit is the new StreamWriter(sFileName, true, Encoding.UTF8) line of code. For each data row you are opening the file, appending a single command, and then closing the file again. That is very inefficient.

Try using a StringBuilder instead of writing to the stream, and then, outside of the reading loop, write the string built by the StringBuilder to the file just once.

I think you'll find that this speeds things up enormously.
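
A minimal sketch of that change, reusing the variable names from the question; the output path is a placeholder, and the column list is built just once since it never changes between rows:

// Build the column list once -- it is identical for every row.
string s_fields = String.Join(",", (string[])_Arr.ToArray(typeof(string)));

StringBuilder sqlScript = new StringBuilder();
StringBuilder s_values = new StringBuilder();

while (_dr.Read())
{
    s_values.Length = 0; // reset the value list for each row
    for (int i_a = 0; i_a < _Arr.Count; i_a++)
    {
        if (i_a > 0) s_values.Append(',');
        s_values.Append('\'').Append(_dr[i_a]).Append('\'');
    }
    sqlScript.AppendFormat("INSERT INTO {0} ({1}) VALUES ({2})",
        sTableName, s_fields, s_values);
    sqlScript.AppendLine();
}

// One open/write/close for the whole script instead of one per row.
// "C:\\inserts.sql" is a placeholder output path.
File.WriteAllText("C:\\inserts.sql", sqlScript.ToString(), Encoding.UTF8);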

Enigmativity

I see two primary issues. First, you should open the StreamWriter once, before the loop, and close it after you're done. Like this:

using (StreamWriter sw = new StreamWriter(sFileName, false, Encoding.UTF8))
{
    while (_dr.Read())
    {
        // convert here
        // Each line is written to the StreamWriter
    }
}

The using construct will ensure that the StreamWriter is properly closed and disposed.

The second issue is string concatenation. Rather than repeatedly appending to strings, use StringBuilder:

StringBuilder s_fields = new StringBuilder();
StringBuilder s_values = new StringBuilder();
int arr_count = _Arr.Count - 1;
while (_dr.Read())
{
    s_fields.Length = 0; // reset both builders for each row
    s_values.Length = 0;
    for (int i_a = 0; i_a <= arr_count; i_a++)
    {
        if (i_a < arr_count) // comma after every field except the last
        {
            s_fields.AppendFormat("{0},", _Arr[i_a]);
            s_values.AppendFormat("'{0}',", _dr[i_a]);
        }
        else
        {
            s_fields.Append(_Arr[i_a]);
            s_values.AppendFormat("'{0}'", _dr[i_a]);
        }
    }
    strSQL = String.Format("INSERT INTO {0} ({1}) VALUES ({2})",
        sTableName, s_fields, s_values);
    sw.WriteLine(strSQL);
}

That should improve your performance quite a bit.

Jim Mischel