
I am using SQL Server 2012 and have a huge file, approximately 20 GB in size. I want to insert every record in the file into the database, and I am using the SqlBulkCopy class for this purpose. Since the amount of data is very large, I have to insert it part by part. Here is the code:

String line;
SqlConnection conn = new SqlConnection(ConfigurationManager.ConnectionStrings["conStrtingName"].ConnectionString);
conn.Open();
StreamReader readFile = new StreamReader(filePath);
SqlTransaction transaction = conn.BeginTransaction();
try
{
    SqlBulkCopy copy = new SqlBulkCopy(conn, SqlBulkCopyOptions.KeepIdentity, transaction);
    copy.BulkCopyTimeout = 600;
    copy.DestinationTableName = "Txn";
    int counter = 0;
    while ((line = readFile.ReadLine()) != null)
    {
        string[] fields = line.Split('\t');
        if (fields.Length == 3)
        {
            DateTime date = Convert.ToDateTime(fields[0]);
            decimal txnCount = Convert.ToDecimal(fields[1]);
            string merchantName = fields[2];
            if (!string.IsNullOrEmpty(merchantName))
            {
                long MerchantId = Array.IndexOf(Program.merchantArray, merchantName) + 1;
                tables[workerId].Rows.Add(MerchantId, date, txnCount);
                counter++;
                if (counter % 100000 == 0)
                    Console.WriteLine("Worker: " + workerId + " - Transaction Records Read: " + counter);
                if (counter % 1000000 == 0)
                {
                    copy.WriteToServer(tables[workerId]);
                    transaction.Commit();
                    tables[workerId].Rows.Clear();
                    //transaction = conn.BeginTransaction();
                    Console.WriteLine("Worker: " + workerId + " - Transaction Records Inserted: " + counter);
                }
            }
        }
    }
    Console.WriteLine("Total Transaction Records Read: " + counter);
    if (tables[workerId].Rows.Count > 0)
    {
        copy.WriteToServer(tables[workerId]);
        transaction.Commit();
        tables[workerId].Rows.Clear();
        Console.WriteLine("Worker: " + workerId + " - Transaction Records Inserted: " + counter);
    }
}
catch (Exception ex)
{
    Console.WriteLine(ex.Message);
    transaction.Rollback();
}
finally
{
    conn.Close();
}

It works for the first 100000 records. However, for the next set of records I get the exception "The transaction is either not associated with the current connection or has been completed."

This happens when control reaches transaction.Commit(); for the next set of records.

Is there a workaround?

Nilesh Barai
  • This might be helpful: http://stackoverflow.com/questions/11453066/error-the-transaction-associated-with-the-current-connection-has-completed-but – Mahmoud Gamal Oct 20 '15 at 10:13
  • Did you check the memory usage of your application in the task manager? – Mahmoud Fayez Oct 20 '15 at 10:17
  • https://www.simple-talk.com/sql/database-administration/sql-server-database-growth-and-autogrowth-settings/ – Mahmoud Fayez Oct 20 '15 at 10:17
  • @MahmoudGamal: in that link, more than one bulkCopy object is created. Is it that with one bulk copy object only one transaction is allowed? Since we commit the transaction, the transaction is complete for the first batch. So maybe for the next batch it identifies that the transaction is already complete? – Nilesh Barai Oct 20 '15 at 10:17
  • Have you picked the right tool for the job? SQL Server has features (such as `BULK INSERT` or SSIS) which can take an input file, process it, and place the data into a table, taking care of all of the buffering, etc requirements so that you don't have to. Why is c# necessarily part of the solution here? – Damien_The_Unbeliever Oct 20 '15 at 10:49
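
For reference, a rough sketch of the `BULK INSERT` route Damien mentions, issued from C#. The file path is a placeholder, and the file must be readable by the SQL Server service itself. Note also that the question's code maps merchantName to a MerchantId in C#, which plain `BULK INSERT` cannot do; a staging table plus a join would be needed for that step:

using System.Configuration;
using System.Data.SqlClient;

// Sketch only: hands the whole file to SQL Server in one statement,
// letting the server handle buffering and batching.
string sql = @"BULK INSERT Txn
               FROM 'D:\data\transactions.tsv'
               WITH (FIELDTERMINATOR = '\t', ROWTERMINATOR = '\n', BATCHSIZE = 100000)";

using (var conn = new SqlConnection(
    ConfigurationManager.ConnectionStrings["conStrtingName"].ConnectionString))
using (var cmd = new SqlCommand(sql, conn))
{
    cmd.CommandTimeout = 600;
    conn.Open();
    cmd.ExecuteNonQuery();
}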

3 Answers

2

The problem is the commented-out line after the transaction is committed. You need to uncomment it, and also reinitialize your SqlBulkCopy copy variable. Better yet, refactor your code: the only place where you need the transaction and copy objects is when you flush the data table that you are filling, like this (you can further factor out the repetitive part into a separate method):

String line;
SqlConnection conn = new SqlConnection(ConfigurationManager.ConnectionStrings["conStrtingName"].ConnectionString);
conn.Open();
StreamReader readFile = new StreamReader(filePath);
SqlTransaction transaction = null;
try
{
    int counter = 0;
    while ((line = readFile.ReadLine()) != null)
    {
        string[] fields = line.Split('\t');
        if (fields.Length == 3)
        {
            DateTime date = Convert.ToDateTime(fields[0]);
            decimal txnCount = Convert.ToDecimal(fields[1]);
            string merchantName = fields[2];
            if (!string.IsNullOrEmpty(merchantName))
            {
                long MerchantId = Array.IndexOf(Program.merchantArray, merchantName) + 1;
                tables[workerId].Rows.Add(MerchantId, date, txnCount);
                counter++;
                if (counter % 100000 == 0)
                    Console.WriteLine("Worker: " + workerId + " - Transaction Records Read: " + counter);
                if (counter % 1000000 == 0)
                {
                    transaction = conn.BeginTransaction(); // fresh transaction for this batch
                    SqlBulkCopy copy = new SqlBulkCopy(conn, SqlBulkCopyOptions.KeepIdentity, transaction);
                    copy.BulkCopyTimeout = 600;
                    copy.DestinationTableName = "Txn";
                    copy.WriteToServer(tables[workerId]);
                    transaction.Commit();
                    transaction = null;
                    tables[workerId].Rows.Clear();
                    Console.WriteLine("Worker: " + workerId + " - Transaction Records Inserted: " + counter);
                }
            }
        }
    }
    Console.WriteLine("Total Transaction Records Read: " + counter);
    if (tables[workerId].Rows.Count > 0)
    {
        transaction = conn.BeginTransaction(); // fresh transaction for the final batch
        SqlBulkCopy copy = new SqlBulkCopy(conn, SqlBulkCopyOptions.KeepIdentity, transaction);
        copy.BulkCopyTimeout = 600;
        copy.DestinationTableName = "Txn";
        copy.WriteToServer(tables[workerId]);
        transaction.Commit();
        transaction = null;
        tables[workerId].Rows.Clear();
        Console.WriteLine("Worker: " + workerId + " - Transaction Records Inserted: " + counter);
    }
}
catch (Exception ex)
{
    Console.WriteLine(ex.Message);
    if (transaction != null) transaction.Rollback();
}
finally
{
    conn.Close();
}

The problem though is that now you cannot roll back ALL the changes in case something goes wrong. Probably the better solution would be not to split your bulk inserts manually, but to use some sort of IDataReader implementation to avoid populating a huge DataTable in memory (for instance, using Marc Gravell's ObjectReader).
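
A minimal sketch of that streaming approach, assuming the FastMember NuGet package for ObjectReader and the same tab-delimited layout as in the question. The iterator yields rows lazily, so the whole file is never held in memory, and BatchSize/EnableStreaming keep both client and server memory bounded (EnableStreaming requires .NET 4.5):

using System;
using System.Collections.Generic;
using System.Data.SqlClient;
using System.IO;
using FastMember; // NuGet package providing ObjectReader

static class Loader
{
    class TxnRow
    {
        public long MerchantId { get; set; }
        public DateTime Date { get; set; }
        public decimal TxnCount { get; set; }
    }

    // Lazily yields one parsed row at a time; File.ReadLines streams the file.
    static IEnumerable<TxnRow> ReadRows(string filePath)
    {
        foreach (string line in File.ReadLines(filePath))
        {
            string[] fields = line.Split('\t');
            if (fields.Length != 3 || string.IsNullOrEmpty(fields[2]))
                continue;
            yield return new TxnRow
            {
                MerchantId = Array.IndexOf(Program.merchantArray, fields[2]) + 1,
                Date = Convert.ToDateTime(fields[0]),
                TxnCount = Convert.ToDecimal(fields[1])
            };
        }
    }

    public static void BulkInsert(string connectionString, string filePath)
    {
        using (var conn = new SqlConnection(connectionString))
        {
            conn.Open();
            using (var copy = new SqlBulkCopy(conn, SqlBulkCopyOptions.KeepIdentity, null))
            using (var reader = ObjectReader.Create(ReadRows(filePath),
                                                    "MerchantId", "Date", "TxnCount"))
            {
                copy.DestinationTableName = "Txn";
                copy.BulkCopyTimeout = 600;
                copy.BatchSize = 100000;     // server commits in chunks
                copy.EnableStreaming = true; // read from the source lazily
                copy.WriteToServer(reader);  // one streaming call for the whole file
            }
        }
    }
}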

Ivan Stoev
  • @NileshBarai You're welcome. However, after further thinking, I'm not quite sure it's a correct solution - yes, the exception is gone, but the behavior is different. See the last paragraph in the answer. I've provided it just FYI; it may or may not be a problem for you. Good luck. – Ivan Stoev Oct 20 '15 at 11:07
  • His solution looks good, but it needs an IEnumerable collection of data to work. In my case, my system will throw an OutOfMemoryException since the data size is huge and I don't have that amount of RAM to accommodate everything in a single collection. As far as transaction rollback is concerned, that's not a big problem. However, when I executed my refactored code, it was able to insert almost 60 million records, after which my system ran out of memory because the size of the SQL Server process grew beyond 10 GB. Looking for some other alternative... – Nilesh Barai Oct 20 '15 at 14:04
  • @NileshBarai The last thing. If you don't care about rolling back the inserted data in case some error occurs, you can try removing the transaction completely, which may reduce the load on your server. See the thread [Does SqlBulkCopy automatically start a transaction?](http://stackoverflow.com/questions/26583432/does-sqlbulkcopy-automatically-start-a-transaction) – Ivan Stoev Oct 20 '15 at 15:00
1

Your transaction is committed after each batch is written. So it is "gone", and you have to start another one with transaction = conn.BeginTransaction().

It may be good to rework the code to better reflect the lifespan of the transaction. You also might want to make sure that "copy" is recreated with the new transaction, as sketched below.
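
A minimal sketch of that per-batch pattern, with names borrowed from the question (illustration only): each flush gets a fresh transaction and a fresh SqlBulkCopy bound to it, and both are disposed before the next batch starts:

using System.Data;
using System.Data.SqlClient;

// Called once per batch: the transaction and the bulk copy
// live only for the duration of a single flush.
static void FlushBatch(SqlConnection conn, DataTable table)
{
    using (SqlTransaction tx = conn.BeginTransaction())
    using (var copy = new SqlBulkCopy(conn, SqlBulkCopyOptions.KeepIdentity, tx))
    {
        copy.DestinationTableName = "Txn";
        copy.BulkCopyTimeout = 600;
        copy.WriteToServer(table);
        tx.Commit();
    }
    table.Rows.Clear();
}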

Argeman
  • So that means I will also have to create a new SqlBulkCopy object, since the transaction is associated with it. Do I also need to reopen the connection, or will the existing connection work? – Nilesh Barai Oct 20 '15 at 10:25
  • The connection should work; at least in the cases where I did similar things it always worked – Argeman Oct 20 '15 at 10:26
0

You can increase the timeout for your transaction like this (use values appropriate for the expected length of your transaction). The code below sets it to 15 minutes:

using System.Transactions; // requires a reference to the System.Transactions assembly

using (TransactionScope scope =
           new TransactionScope(TransactionScopeOption.Required,
                                new TimeSpan(0, 15, 0)))
{
    // working code here
}
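
A fuller sketch of how that scope combines with SqlBulkCopy (connection string and table name follow the question). A connection opened inside the scope enlists in the ambient transaction automatically; note that timeouts above the machine-wide TransactionManager maximum (10 minutes by default) are silently clamped:

using System;
using System.Data;
using System.Data.SqlClient;
using System.Transactions;

static void BulkInsertInScope(string connectionString, DataTable table)
{
    using (var scope = new TransactionScope(TransactionScopeOption.Required,
                                            new TimeSpan(0, 15, 0)))
    using (var conn = new SqlConnection(connectionString))
    {
        conn.Open(); // enlists in the ambient transaction
        using (var copy = new SqlBulkCopy(conn))
        {
            copy.DestinationTableName = "Txn";
            copy.BulkCopyTimeout = 600;
            copy.WriteToServer(table);
        }
        scope.Complete(); // without this, disposing the scope rolls back
    }
}
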
Krsna Kishore
  • The transaction is not timing out, as my first batch of 100000 gets inserted. For the next batch, however, when I try to commit it throws an exception – Nilesh Barai Oct 20 '15 at 10:15