
I would like to find out how to recover from a unique key violation error on insert in DBML (LINQ to SQL).

My scenario:

  1. I create a new DBML-mapped object (call it A), which may be a duplicate but cannot be checked beforehand, and call InsertOnSubmit(A).
  2. On the context I call SubmitChanges(ConflictMode.ContinueOnConflict).
  3. I catch the error and try to resolve all conflicts with RefreshMode.OverwriteCurrentValues.
  4. When I retry SubmitChanges I get the same error.

Code:

// dimension_date has two columns: a datetime Date and an identity key
var new_date_row = new dimension_date();
new_date_row.Date = new DateTime(2014, 1, 1);
db_context.dimension_dates.InsertOnSubmit(new_date_row);

try
{
    db_context.SubmitChanges(ConflictMode.ContinueOnConflict);
}
catch (Exception e)
{
    Console.WriteLine(e.ToString());
}

db_context.ChangeConflicts.ResolveAll(RefreshMode.OverwriteCurrentValues);
db_context.SubmitChanges(); // exception thrown here

As a result, the conflicts are never resolved when a duplicate date has been inserted, and the same exception is thrown again on the second SubmitChanges() call.

How can I recover from the error so that SubmitChanges() can be called successfully?

I am looking for a solution to use inside more complex ETL processing code. It should run for multiple rows, each with ten or more dimensions, before I submit changes. I don't mind unique constraint errors - they mean the values I want to insert are already in the db, and I am happy with that. The next steps are surrogate key substitution for each dimension and insert-or-update for the fact table (not included in this example).
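
To make the recovery I am after more concrete, this is roughly what I would like the retry to look like. It is a sketch only; the SqlException error numbers 2627 and 2601 (unique constraint and unique index violations) and the OfType filter are my assumptions:

try
{
    db_context.SubmitChanges(ConflictMode.ContinueOnConflict);
}
catch (System.Data.SqlClient.SqlException e)
{
    // re-throw anything that is not a unique key / unique index violation
    if (e.Number != 2627 && e.Number != 2601)
        throw;

    // cancel only the pending date inserts whose value already exists in the db,
    // keep every other pending change, then retry
    foreach (var insert in db_context.GetChangeSet().Inserts.OfType<dimension_date>().ToList())
    {
        if (db_context.dimension_dates.Any(x => x.Date == insert.Date))
            db_context.dimension_dates.DeleteOnSubmit(insert); // cancels the pending insert
    }

    db_context.SubmitChanges();
}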

user3218782
  • Why don't you check db_context.dimension_dates prior to trying to add a new one? Then you could avoid the problem. If your intent is to update a row that contains the same PK, then update it rather than trying to insert new - or, delete the existing row first. – Mike Aug 01 '14 at 14:19
  • I intend to run this code in parallel on multiple instances; therefore, unless it is locked inside a transaction, I can never be sure that another process has not inserted a new dimension in the meantime – user3218782 Aug 01 '14 at 14:34
  • I found a partial solution - putting this code in the catch block: `foreach (var insert in db_context.GetChangeSet().Inserts) { db_context.GetTable(insert.GetType()).DeleteOnSubmit(insert); }` This, however, removes all pending changes and not only the one that failed - not good enough! – user3218782 Aug 01 '14 at 16:46

1 Answer


The preferred solution to this would be:

        var date = new DateTime(2014, 1, 1);
        if (!db_context.dimension_dates.Any(x => x.Date == date))
        {
            var new_date_row = new dimension_date();
            // two columns: datetime date, identity key
            new_date_row.Date = date;
            db_context.dimension_dates.InsertOnSubmit(new_date_row);
            db_context.SubmitChanges();
        }
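
If many rows need to be processed before a single SubmitChanges, the existence check can be moved into a loop and the submit deferred until a batch is complete (as discussed in the comments below). This is a minimal sketch; the incoming_dates sequence, the batchSize value, and the local dictionary cache (needed so the same date is not queued twice within one batch, since the database query cannot see pending inserts) are assumptions and not part of the original answer:

        var seen_dates = new Dictionary<DateTime, dimension_date>();
        int pending = 0, batchSize = 500;

        foreach (var d in incoming_dates)      // incoming_dates: the dates of the rows being processed
        {
            if (seen_dates.ContainsKey(d))
                continue;                      // already found or queued in this batch

            var existing = db_context.dimension_dates.FirstOrDefault(x => x.Date == d);
            if (existing == null)
            {
                existing = new dimension_date { Date = d };
                db_context.dimension_dates.InsertOnSubmit(existing);
                pending++;
            }
            seen_dates[d] = existing;

            if (pending >= batchSize)          // flush in batches instead of per row
            {
                db_context.SubmitChanges();
                pending = 0;
            }
        }
        db_context.SubmitChanges();            // flush whatever is left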
Rob Epstein
  • if(!db_context.dimension_dates.Any(x=>x.Date == date)) would be better. It's inefficient to retrieve the value when it isn't needed. – Mike Aug 01 '14 at 14:32
  • Hi Rob, thanks for your time and answer. I should have added that this is part of a much bigger ETL processor; I cannot afford submitting changes with every dimension being updated. I plan to add hundreds to thousands of single changes at a time before submitting changes. – user3218782 Aug 01 '14 at 14:35
  • Then I would remove the SubmitChanges() call from the sample and call SubmitChanges() when your looping is complete or when you hit a configured batch size. Have you investigated SSIS for your ETL needs as opposed to custom C#? – Rob Epstein Aug 01 '14 at 14:38
  • Having seen your comment at the top, could you not parallelize based on destination entity to eliminate the concern of processing the same entity twice in two different threads? – Rob Epstein Aug 01 '14 at 14:39
  • I am rewriting existing SSIS, as it became unmanageable when the related stored proc grew over 2k lines. C# based on custom LINQ queries and smart joins is also much faster. – user3218782 Aug 01 '14 at 14:44
  • Rob, as far as I understand LINQ to SQL in C#, I need to insert all dimension values first before I can start getting keys for substitution, is that right? I could not find any example of a persistence layer for a simple star schema model. – user3218782 Aug 01 '14 at 14:46
  • If you have navigation properties on dependent entities then you can set those properties accordingly and L2S will ensure generated keys get set (see the sketch after these comments). – Rob Epstein Aug 01 '14 at 14:58
  • Thanks, I will try this; yes, all dimensions are FK'ed. – user3218782 Aug 01 '14 at 15:01
  • Rob, I don't think it will work. I need to find the dimension object that matches my 'date' or create a new one, but then I risk an exception blowing up on Submit when dimension dates are not unique. – user3218782 Aug 01 '14 at 15:27
  • Could you pre-process the dates by building up a list of distinct dates in the inbound data stream you're processing? That way you're guaranteed they exist when looping through the related dimensions. – Rob Epstein Aug 01 '14 at 15:29
  • Hi Rob, I have many different dimensions as well; the same would apply there, and I don't know the data beforehand. Is there any way to recover when you hit a unique key constraint error? How do I clear the context so I can add another value-key pair and call Submit without the old error being re-raised? – user3218782 Aug 01 '14 at 16:04
  • Not that I know of which is why I've been making process suggestions. – Rob Epstein Aug 01 '14 at 16:19
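
A small sketch of the navigation-property approach from the comment above. The fact_sale entity, its sale_amount column, and its association to dimension_date are hypothetical stand-ins for the real star schema; the point is that assigning the dimension entity itself lets LINQ to SQL fill in the generated identity key on SubmitChanges:

var date = new DateTime(2014, 1, 1);
var date_row = db_context.dimension_dates.FirstOrDefault(x => x.Date == date);
if (date_row == null)
{
    date_row = new dimension_date { Date = date };
    db_context.dimension_dates.InsertOnSubmit(date_row);   // identity key not generated yet
}

// hypothetical fact entity with a DBML association to dimension_date
var fact = new fact_sale
{
    sale_amount = 123.45m,
    dimension_date = date_row   // set the navigation property, not the surrogate key
};
db_context.fact_sales.InsertOnSubmit(fact);

// On SubmitChanges, LINQ to SQL inserts the new dimension row first, reads back
// its identity key, and writes it into the fact row's foreign key column.
db_context.SubmitChanges();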