15

I have an application running on nHibernate v4.0.4.4000 - it is running in production on three seperate webservers. For ID-generation, I'm using the default HiLo implementation (unique id across tables).

Sometimes, it generates duplicate Id's when saving new entities with the following stack-trace:

at NHibernate.AdoNet.SqlClientBatchingBatcher.DoExecuteBatch(IDbCommand ps)
at NHibernate.AdoNet.AbstractBatcher.ExecuteBatchWithTiming(IDbCommand ps)
at NHibernate.AdoNet.AbstractBatcher.ExecuteBatch()
at NHibernate.AdoNet.AbstractBatcher.PrepareCommand(CommandType type, SqlString sql, SqlType[] parameterTypes)
at NHibernate.AdoNet.AbstractBatcher.PrepareBatchCommand(CommandType type, SqlString sql, SqlType[] parameterTypes)
at NHibernate.Persister.Entity.AbstractEntityPersister.Insert(Object id, Object[] fields, Boolean[] notNull, Int32 j, SqlCommandInfo sql, Object obj, ISessionImplementor session)
at NHibernate.Persister.Entity.AbstractEntityPersister.Insert(Object id, Object[] fields, Object obj, ISessionImplementor session)
at NHibernate.Action.EntityInsertAction.Execute()
at NHibernate.Engine.ActionQueue.Execute(IExecutable executable)
at NHibernate.Engine.ActionQueue.ExecuteActions(IList list)
at NHibernate.Engine.ActionQueue.ExecuteActions()
at NHibernate.Event.Default.AbstractFlushingEventListener.PerformExecutions(IEventSource session)
at NHibernate.Event.Default.DefaultFlushEventListener.OnFlush(FlushEvent event)
at NHibernate.Impl.SessionImpl.Flush()
at Xena.Database.Main.Listeners.Strategies.CreateEntityAuditTrailStrategy.Execute(Object criteria) in K:\Projects\Xena\WorkDir\src\Xena.Database.Main\Listeners\Strategies\CreateEntityAuditTrailStrategy.cs:line 41
at Xena.Domain.Rules.Strategies.StrategyExtensions.Execute[TCriteria](IEnumerable`1 strategies, TCriteria criteria) in K:\Projects\Xena\WorkDir\src\Xena.Domain\Rules\Strategies\RelayStrategy.cs:line 55
at NHibernate.Action.EntityInsertAction.PostInsert()
at NHibernate.Action.EntityInsertAction.Execute()
at NHibernate.Engine.ActionQueue.Execute(IExecutable executable)
at NHibernate.Engine.ActionQueue.ExecuteActions(IList list)
at NHibernate.Engine.ActionQueue.ExecuteActions()
at NHibernate.Event.Default.AbstractFlushingEventListener.PerformExecutions(IEventSource session)
at NHibernate.Event.Default.DefaultAutoFlushEventListener.OnAutoFlush(AutoFlushEvent event)
at NHibernate.Impl.SessionImpl.AutoFlushIfRequired(ISet`1 querySpaces)
at NHibernate.Impl.SessionImpl.List(CriteriaImpl criteria, IList results)
at NHibernate.Impl.CriteriaImpl.List(IList results)
at NHibernate.Impl.CriteriaImpl.UniqueResult[T]()
at Xena.Web.EntityUpdaters.LedgerPostPreviewUpdater.TryCreateNewFiscalEntity(ISession session, FiscalSetup fiscalSetup, LedgerPostPreview& entity, IEnumerable`1& errors) in K:\Projects\Xena\WorkDir\src\Xena.Web\EntityUpdaters\LedgerPostPreviewUpdater.cs:line 52
at Xena.Web.SecurityContext.<>c__DisplayClass8_0`1.<TrySaveUpdate>b__0(ISession session, TEntity& entity, IEnumerable`1& errors) in K:\Projects\Xena\WorkDir\src\Xena.Web\SecurityContext.cs:line 235
at Xena.Web.SecurityContext.<>c__DisplayClass41_0`1.<TrySave>b__0(ITransaction tx) in K:\Projects\Xena\WorkDir\src\Xena.Web\SecurityContext.cs:line 815
at Xena.Web.SecurityContext.TryWrapInTransaction[T](Func`2 action, T& result, IEnumerable`1& errors, Boolean alwaysCommit) in K:\Projects\Xena\WorkDir\src\Xena.Web\SecurityContext.cs:line 804
at Xena.Web.SecurityContext.TrySave[TEntity](IEntityUpdater`1 entityUpdater, EntityCreate`1 create) in K:\Projects\Xena\WorkDir\src\Xena.Web\SecurityContext.cs:line 812
at Xena.Web.SecurityContext.TrySaveUpdate[TEntity](IFiscalEntityUpdater`1 entityUpdater) in K:\Projects\Xena\WorkDir\src\Xena.Web\SecurityContext.cs:line 236
at Xena.Web.Api.XenaFiscalApiController.WrapSave[TEntity,TDto](IFiscalEntityUpdater`1 updater, Func`2 get, Action`2 postGet) in K:\Projects\Xena\WorkDir\src\Xena.Web\Api\Abstract\XenaFiscalApiController.cs:line 35
at Xena.Web.Api.ApiLedgerPostPreviewController.Post(LedgerPostPreviewDto ledgerPostPreview) in K:\Projects\Xena\WorkDir\src\Xena.Web\Api\ApiLedgerPostPreviewController.cs:line 79
at lambda_method(Closure , Object , Object[] )
at System.Web.Http.Controllers.ReflectedHttpActionDescriptor.ActionExecutor.<>c__DisplayClass10.<GetExecutor>b__9(Object instance, Object[] methodParameters)
at System.Web.Http.Controllers.ReflectedHttpActionDescriptor.ExecuteAsync(HttpControllerContext controllerContext, IDictionary`2 arguments, CancellationToken cancellationToken)
--- End of stack trace from previous location where exception was thrown ---
at System.Runtime.ExceptionServices.ExceptionDispatchInfo.Throw()
at System.Runtime.CompilerServices.TaskAwaiter.HandleNonSuccessAndDebuggerNotification(Task task)
at System.Web.Http.Controllers.ApiControllerActionInvoker.<InvokeActionAsyncCore>d__0.MoveNext()
--- End of stack trace from previous location where exception was thrown ---
at System.Runtime.ExceptionServices.ExceptionDispatchInfo.Throw()
at System.Runtime.CompilerServices.TaskAwaiter.HandleNonSuccessAndDebuggerNotification(Task task)
at System.Runtime.CompilerServices.TaskAwaiter`1.GetResult()
at System.Web.Http.Controllers.ActionFilterResult.<ExecuteAsync>d__2.MoveNext()
--- End of stack trace from previous location where exception was thrown ---
at System.Runtime.ExceptionServices.ExceptionDispatchInfo.Throw()
at System.Runtime.CompilerServices.TaskAwaiter.HandleNonSuccessAndDebuggerNotification(Task task)
at System.Web.Http.Filters.AuthorizationFilterAttribute.<ExecuteAuthorizationFilterAsyncCore>d__2.MoveNext()
--- End of stack trace from previous location where exception was thrown ---
at System.Runtime.ExceptionServices.ExceptionDispatchInfo.Throw()
at System.Runtime.CompilerServices.TaskAwaiter.HandleNonSuccessAndDebuggerNotification(Task task)
at System.Runtime.CompilerServices.TaskAwaiter`1.GetResult()
at System.Web.Http.Dispatcher.HttpControllerDispatcher.<SendAsync>d__1.MoveNext()

And the following message:

Message=Violation of PRIMARY KEY constraint 'PK_LedgerPostPreview'. Cannot insert duplicate key in object 'dbo.LedgerPostPreview'. The duplicate key value is (94873244).
The statement has been terminated.

The SessionFactory is set to use SnapshotIsolation, the DB is set at compability level 2008 (100)

As far as I can tell, the updating of the hilo value is running in a transaction separate from the "normal" transactions (I've tried causing an exception - the hilo value is not rolled back (which makes sense)).

According to the NHibernate profiler, the SQL run against the server for hilo values is:

Reading high value: 
select next_hi
from   hibernate_unique_key with (updlock, rowlock)
Updating high value: 
update hibernate_unique_key
set    next_hi = 5978 /* @p0 */
where  next_hi = 5977 /* @p1 - next_hi */

What am I missing? Shouldn't the HiLo guard against duplicates?

EDIT: The duplicate IDs are not happening only on one table, but in the tables with very frequent inserts and deletions. The above code was the simplest among the suspects and is extremely simple - it only .Get() a parent to check it is there and then creates and calls .Save() on the new entity along with an audit trail row (which uses the PostInsert eventlistener in nHibernate).

EDIT2: Id-Mapping for the above type (used across all entities):

    public static void MapId<TMapping, TType>(this TMapping mapping)
        where TMapping : ClassMapping<TType>
        where TType : class,IHasId
    {
        mapping.Id(m => m.Id, m => m.Generator(Generators.HighLow, g => g.Params(new { max_lo = 100 })));
    }

The weird part is that (due to @Dexions comment) when I check both the audit trail and the table - nothing has been persisted. The code used to persist is as follows:

using (var tx = Session.BeginTransaction())
{
    try
    {
        var voucherPreview = Session.Get<VoucherPreview>(voucherPreviewId); //Parent
        var postPreview = //Factory create with the voucherPreview;
        var index = Session.QueryOver<LedgerPostPreview>()
            .Where(lpp => lpp.VoucherPreview == voucherPreview)
            .SelectList(l => l.SelectMax(lpp => lpp.Index))
            .SingleOrDefault<int>() + 1
        postPreview.Index = index;
        // Set a few other properties and check validity
        Session.SaveOrUpdate(postPreview);
    }
    catch(Exception ex)
    {
        //Errorhandling leading to the above stacktrace
    }
}
Goblin
  • 7,970
  • 3
  • 36
  • 40
  • Are the hi/lo configuration values set to the exact same values on all application instances? – Oskar Berggren Oct 09 '16 at 15:03
  • Yes - all are set to the same values - we have the same code running on all three instances and the value is set in-code. – Goblin Oct 10 '16 at 08:09
  • So given the SQL you provided (from profiler), what prevents two transactions to read same next_hi value? – Evk Oct 10 '16 at 15:46
  • Concurrent inserts? – Dexion Oct 11 '16 at 07:45
  • Can you find if the duplicates are inserted by the same application? – Jaguar Oct 11 '16 at 11:47
  • @Evk That it sends an updlck with the statement - making the next transaction wait for it to complete the transaction – Goblin Oct 11 '16 at 12:14
  • @Dexion That would mean that either two different webservers somehow violated the transaction-scope while getting og updating the hilo value - or that the SessionFactory is not thread-safe. I'm sorta hoping none-of-those... – Goblin Oct 11 '16 at 12:16
  • @Jaguar: It is the same application - but not necessarily the same webserver (ie. instance). – Goblin Oct 11 '16 at 12:17
  • duplicated insert calls because of some bug in the persistence logic? is the second call immediately after the first or it just comes later (minutes, hours, etc) ? – Dexion Oct 11 '16 at 12:48
  • can you also post your hi-lo config? – Jaguar Oct 11 '16 at 12:53
  • @Dexion - I've added the code responsible for saving the entity. The duplicated insert calls originate here (since no other piece of code succeeds in creating the entry). But I cannot find anything trying to save it doubly. – Goblin Oct 11 '16 at 13:41
  • @Jaguar I've updated my answer with the mapping – Goblin Oct 11 '16 at 13:41
  • Silly questions: is your PK column set to auto increment, identity? Is your mapping collision safe? – Dexion Oct 12 '16 at 05:55
  • @Dexion All Id-columns are not-nullable int64 with a unique constraint - no auto-increment nor identity specified. I haven't heard the term mapping collision before - do you have a definition/elaboration? – Goblin Oct 12 '16 at 07:19
  • another silly question, has the max_lo been changed over time? – Jaguar Oct 12 '16 at 07:28
  • @Jaguar I'm grasping at straws now, so there are no silly questions :-) the max_lo has been the same for close to 4 years now. – Goblin Oct 12 '16 at 07:31
  • Ignore the mapping collision. i think, that the ID generator logic may have a bug - can you set your PKs to autoincrement identity and change the logic so that it can handle the auto generated keys? I wouldn't generate unique int IDs "manually", I would always use auto increment PKs. – Dexion Oct 12 '16 at 07:40
  • @Dexion I could use auto-incrementing, but that would remove a lot of nice-ties (every insert would take two round-trips instead of one, I would no longer have unique id's across tables) – Goblin Oct 12 '16 at 14:32
  • in this case use GUID, this is for that purpose. or prevent concurrent writing with lock, transaction, etc. you are trying to reinvent the wheel. – Dexion Oct 13 '16 at 15:46
  • GUIDs are not without issues - and I have invented nothing - this is the default way of handling number-based Ids in nHibernate without relying on the native ID-generation in MS SQL. – Goblin Oct 14 '16 at 14:53

3 Answers3

1

I figured out the problem. It turns out, it had nothing to do with the Id.

As a part of the insert statement, we update a secondary table that controls a number series. The problem occurs if that secondary table experiences a snapshot isolation fault - since everything is handled in SQLCommandSets internally in nHibernate - the error bubbles up the chain with a faulty description.

Goblin
  • 7,970
  • 3
  • 36
  • 40
  • you are definitely missing something... the error message that you post in the question is one generated by SQL server and not by NHibernate, so indeed a duplicate key insert was attempted. – Jaguar Nov 08 '16 at 20:45
  • Except when I look at the log from the profiler - that is the culprit. And if I remove that race-condition - everything works. I have the tests to prove it. – Goblin Nov 09 '16 at 14:37
  • so it is SQL server that is missreporting? – Jaguar Nov 10 '16 at 13:43
  • 1
    As far as I can tell - it is either the driver for sql server - or sql server it-self. It might be the sql-server reporting multiple errors, and the driver mangling them somehow. I don't really know how to tell the difference. – Goblin Nov 10 '16 at 19:48
0

Given the comment chain on the question, IMHO there are 2 possible cases that I can think of at the moment.

You either misshandle nhibernate sessions and you got a hidden race condition going on when the actual ID generation happens on a given instance (because ID sequencing on the database is transactionally isolated). This assumes that the same application instance successfully inserted { ID=123 } and then attempted to insert an other object with { ID=123 }. You can trace inserts back to application instances to verify the duplication of inserts happens on the same instance. I am not very sure if this scenario is actually plausible on the whole chain of NHibernate pipeline but ISession is not thread-safe (and that is a known fact). You do say that this has been running for 4 years now (although you don't mention if the bug has been there as long), so maybe a recent commit introduced this behaviour (a collection.AsParallel() would be enough to trigger it I believe) ?


A different angle on the problem assumes that an already inserted object was loaded and then detached from the ISession, but got re-attached (by design or by accident) on a (same/different) ISession which then promptly attempted to insert the object. This can happen and a hypothetical scenario could be

  1. var entity123 = Get(123)
  2. var entity123 = entity123.Clone() or ISession.Evict(entity123).
  3. somewhere along the line you call SaveOrUpdate(entity123) (or even worse for tracing you add it on a referenced collection with cascade-save rules)
  4. NHibernate sees an an-managed object with an identifier in place, tries to insert it.

In some earlier version of NHibernate I did see this behaviour with non-identity inserts.

The above could also happen with a bad/dumb factory method that also copies the identifier.

To trace this check if the insertion SQL parameters (for log4net that would be a NHibernate.SQL entry with debug, although I think the NHibernate profiler will expose this as well) match the existing row's column values. If they match exactly then something like the above may be happening. If they match partially, maybe you make partial copies of entities and it mistankenly copies the ID as well.

Jaguar
  • 5,929
  • 34
  • 48
  • A few notes. ISession is not thread-safe - but here everything is happening inside one single transaction in one thread. This is a raw insert, so there is no .Get on the object in question (as we first know the Id after the insert, it would be impossible to guess). We do not use .Merge nor .Evict in this particular part of the code-base and the object has no collection nor is it part of a collection. The Id property is only ever touched by nHibernate (it has a protected setter, so we cannot set it by accident) – Goblin Oct 12 '16 at 14:37
  • The bug has surfaced some 3 months ago - and it is increasing in frequency as the load on the system increases. And it is intermittent - so the bug only happens sometimes and is not reproducable on a local machine. As far as I can tell from the source code nHibernate .Save and .SaveOrUpdate makes exactly the same calls - so I'm a bit stumped. – Goblin Oct 12 '16 at 14:38
  • As a further aside - the id-generation only involves the DB every 100 entities (by design of the hilo algoritm) - and should be thread-safe. – Goblin Oct 12 '16 at 14:44
  • @Goblin well I ran a console utility with multiple concurrent inserts, and I couldn't reproduce the PK violation – Jaguar Oct 14 '16 at 11:40
  • That sounds like my own tests :-) It is as if, heavy load causes a form of "shadow"-flush, so two INSERT statements are sent instead of one - but I cannot figure out, how that would be possible. – Goblin Oct 14 '16 at 14:51
  • @Goblin I just happened to see in your sample that the transaction is created on the Session variable while you also write session.QueryOver. Is this a typographic error? Are you by any chance mixing ISessions? – Jaguar Oct 14 '16 at 15:08
  • That was a typo :-) Everything uses the .Session property (which is handled outside the scope of this code. – Goblin Oct 14 '16 at 15:27
0

what if you just changed to:

postPreview.Index = index+1;
T.Todua
  • 53,146
  • 19
  • 236
  • 237
  • I fail to see how that would change anything. The problem here is the .Id - not the index :-) – Goblin Oct 15 '16 at 13:20