
We have several Hangfire jobs, and occasionally they will bring down our entire site. Hangfire isn't catching the errors, and we even added a try/catch around the ProcessCore() call, but it's not hitting that code.

Has anyone else had this issue? Is there some kind of specific setup that needs to be done to avoid this? (In the error below, BackgroundServices.UpdateClientStats is the Hangfire job)

RecurringJob.AddOrUpdate("updateClientStats", () => container.GetInstance<UpdateClientStats>().Process(null), Cron.Daily(7));




public class UpdateClientStats : BackgroundAppService {
        private readonly ClientStatsHelper clientStatsHelper;
        public UpdateClientStats(DBContext context, IDomainRuleViolationCollection domainRuleViolationCollection, ClientStatsHelper clientStatsHelper) : base(context, domainRuleViolationCollection) {
            this.clientStatsHelper = clientStatsHelper;
        }

       protected async override Task ProcessCore() {
            var userIds = context.Users.Where(x => x.Status == ClientStatus.Active && x.TeamMemberRole.HasFlag(TeamMemberRole.Runner)).Select(x => x.Id).ToList();

            userIds.ForEach(async userId => {
                await clientStatsHelper.UpdateClientStats(userId, false, false);
            });
        }
    }


public async Task UpdateClientStats(long clientUserId, bool skipJournalEntries, bool skipLastScheduledWorkoutDate) {
            using (var scope = serviceScopeFactory.CreateScope()) {
                var context = scope.ServiceProvider.GetService<DBContext>();

... more code logic here...

                var user = await context.Users.Where(x => x.Id == clientUserId).FirstOrDefaultAsync();

                if (user != null) {
                    user.UpdateNextScheduledEvent(await Helpers.ClientHelper.GetClientNextEventDate(context.ClientProgramWorkoutDays.AsQueryable(), clientUserId));

... more logic here...

                    user.UpdateStats(completionPercentage, pastCompletionPercentage, durations7Days.Miles, durationsPast7Days.Miles, durations7Days.TimeInSeconds,
                        durationsPast7Days.TimeInSeconds, durations30Days.Miles, durationsPast30Days.Miles, durations30Days.TimeInSeconds, durationsPast30Days.TimeInSeconds,
                        durationsTrailing7to34Days.Miles, durationsTrailing7to34Days.TimeInSeconds, intensityInSeconds7Days, intensityInSecondsTrailing7to34Days);


                    await context.SaveChangesAsync();
                }
            }
        }



public abstract class BackgroundAppService : BaseAppService {
        public BackgroundAppService(DBContext context, IDomainRuleViolationCollection domainRuleViolationCollection) : base(context, domainRuleViolationCollection) {

        }
        protected abstract Task ProcessCore();

        [DisableConcurrentExecution(timeoutInSeconds: 10)]
        [AutomaticRetry(Attempts = 0, OnAttemptsExceeded = AttemptsExceededAction.Delete)]
        public void Process(PerformContext performContext) {
            using (LogContext.PushProperty("ApplicationName", "Hangfire"))
            using (LogContext.PushProperty("Hangfirejob", this.GetType().Name))
            using (LogContext.PushProperty("HangfireJobID", performContext?.BackgroundJob?.Id))
            using (LogContext.Push(new PerformContextEnricher(performContext))) {
                Log.Information("Job {jobName} Started", this.GetType().Name);

                try {
                    ProcessCore().Wait();
                } catch (Exception ex) {
                    Log.Error(ex.ToString());
                    throw ex;
                }

                Log.Information("Job {jobName} Finished", this.GetType().Name);
            }

        }

    }


Your app crashed because of System.InvalidOperationException

Your app, crashed because of System.InvalidOperationException and aborted the requests it was processing when the overflow occurred. As a result, your app’s users may have experienced HTTP 502 errors.

This call stack caused the exception:

Microsoft.Data.Common.ADP.ExceptionWithStackTrace
System.Runtime.ExceptionServices.ExceptionDispatchInfo.Throw
System.Runtime.CompilerServices.TaskAwaiter.ThrowForNonSuccess
System.Runtime.CompilerServices.TaskAwaiter.HandleNonSuccessAndDebuggerNotification
Microsoft.EntityFrameworkCore.Storage.RelationalConnection+d__50.MoveNext
System.Runtime.ExceptionServices.ExceptionDispatchInfo.Throw
Microsoft.EntityFrameworkCore.Storage.RelationalConnection+d__50.MoveNext
System.Runtime.ExceptionServices.ExceptionDispatchInfo.Throw
System.Runtime.CompilerServices.TaskAwaiter.ThrowForNonSuccess
System.Runtime.CompilerServices.TaskAwaiter.HandleNonSuccessAndDebuggerNotification
Microsoft.EntityFrameworkCore.Storage.RelationalConnection+d__47.MoveNext
System.Runtime.ExceptionServices.ExceptionDispatchInfo.Throw
System.Runtime.CompilerServices.TaskAwaiter.ThrowForNonSuccess
System.Runtime.CompilerServices.TaskAwaiter.HandleNonSuccessAndDebuggerNotification
Microsoft.EntityFrameworkCore.Storage.RelationalCommand+d__17.MoveNext
System.Runtime.ExceptionServices.ExceptionDispatchInfo.Throw
System.Runtime.CompilerServices.TaskAwaiter.ThrowForNonSuccess
System.Runtime.CompilerServices.TaskAwaiter.HandleNonSuccessAndDebuggerNotification
Microsoft.EntityFrameworkCore.Query.RelationalShapedQueryCompilingExpressionVisitor+AsyncQueryingEnumerable`1+AsyncEnumerator+d__17[[System.__Canon System.Private.CoreLib]].MoveNext
System.Runtime.ExceptionServices.ExceptionDispatchInfo.Throw
System.Runtime.CompilerServices.TaskAwaiter.ThrowForNonSuccess
System.Runtime.CompilerServices.TaskAwaiter.HandleNonSuccessAndDebuggerNotification
Microsoft.EntityFrameworkCore.Query.ShapedQueryCompilingExpressionVisitor+d__20`1[[System.__Canon System.Private.CoreLib]].MoveNext
System.Runtime.ExceptionServices.ExceptionDispatchInfo.Throw
Microsoft.EntityFrameworkCore.Query.ShapedQueryCompilingExpressionVisitor+d__20`1[[System.__Canon System.Private.CoreLib]].MoveNext
System.Runtime.ExceptionServices.ExceptionDispatchInfo.Throw
System.Runtime.CompilerServices.TaskAwaiter.ThrowForNonSuccess
System.Runtime.CompilerServices.TaskAwaiter.HandleNonSuccessAndDebuggerNotification
OurApp.Application.Helpers.ClientStatsHelper+d__2.MoveNext
System.Runtime.ExceptionServices.ExceptionDispatchInfo.Throw
System.Runtime.CompilerServices.TaskAwaiter.ThrowForNonSuccess
System.Runtime.CompilerServices.TaskAwaiter.HandleNonSuccessAndDebuggerNotification
OurApp.Application.BackgroundServices.UpdateClientStats+<b__2_2>d.MoveNext
System.Runtime.ExceptionServices.ExceptionDispatchInfo.Throw
System.Threading.Tasks.Task+<>c.b__139_1
System.Threading.QueueUserWorkItemCallback+<>c.<.cctor>b__6_0
System.Threading.ExecutionContext.RunForThreadPoolUnsafe[[System.__Canon System.Private.CoreLib]]
System.Threading.QueueUserWorkItemCallback.Execute
System.Threading.ThreadPoolWorkQueue.Dispatch
System.Threading._ThreadPoolWaitCallback.PerformWaitCallback
christie
  • You should provide the relevant code enqueueing the job, as well as the relevant code of the job itself (especially relevant if you start a new thread at some point) – jbl Jan 10 '22 at 17:04

2 Answers

I would say that the problem comes from

userIds.ForEach(async userId => {
    await clientStatsHelper.UpdateClientStats(userId, false, false);
});

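Because List<T>.ForEach takes an Action<T>, the async lambda is implicitly compiled as an async void delegate. ForEach cannot await it, so ProcessCore() returns before the updates finish, and an exception thrown inside the lambda never reaches the try/catch in Process(). It is rethrown on a thread-pool thread instead, which takes the whole process down. See "Avoid async void".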
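
To make the fire-and-forget behaviour visible without crashing anything, here is a minimal sketch written as a C# top-level program. The Task.Delay call is only a hypothetical stand-in for the real database work, and the ids are made up. On a typical run the first line should report 0 of 3, because ForEach returns at each lambda's first await.

```csharp
using System;
using System.Collections.Generic;
using System.Threading;
using System.Threading.Tasks;

var userIds = new List<long> { 1, 2, 3 };
var completed = 0;

// List<T>.ForEach takes an Action<T>, so this async lambda becomes async void:
// ForEach gets no Task back and returns as soon as each lambda hits its first await.
userIds.ForEach(async userId =>
{
    await Task.Delay(500); // hypothetical stand-in for the real database work
    Interlocked.Increment(ref completed);
});

var completedWhenForEachReturned = Volatile.Read(ref completed);
Console.WriteLine($"Completed when ForEach returned: {completedWhenForEachReturned} of {userIds.Count}");

// Wait only so the demo can show that the work finishes later, with nobody observing it.
await Task.Delay(1500);
Console.WriteLine($"Completed after waiting: {Volatile.Read(ref completed)} of {userIds.Count}");
```

If one of those lambdas threw after its await, there would be no Task for the caller to observe, which is why the try/catch around ProcessCore().Wait() never sees it.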

Try awaiting each call in a plain foreach loop instead:

foreach (var userId in userIds) {
    await clientStatsHelper.UpdateClientStats(userId, false, false);
}

If you want to do all of the processing concurrently, you can collect the tasks and await them together:

var tasks = new List<Task>();
foreach (var userId in userIds) {
    tasks.Add(clientStatsHelper.UpdateClientStats(userId, false, false));
}
await Task.WhenAll(tasks);
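
As a rough illustration of why this version is catchable, the sketch below runs the same pattern as a C# top-level program. UpdateClientStatsStub is a hypothetical stand-in for clientStatsHelper.UpdateClientStats that fails for even ids. Awaiting the WhenAll task rethrows only the first failure, while the task's Exception property holds all of them.

```csharp
using System;
using System.Collections.Generic;
using System.Linq;
using System.Threading.Tasks;

// Hypothetical stand-in for clientStatsHelper.UpdateClientStats; fails for even ids.
static async Task UpdateClientStatsStub(long userId)
{
    await Task.Delay(10);
    if (userId % 2 == 0)
        throw new InvalidOperationException($"Update failed for user {userId}");
}

var userIds = new List<long> { 1, 2, 3, 4 };

// Every call returns a Task, so every failure is stored on a task we hold on to.
var all = Task.WhenAll(userIds.Select(id => UpdateClientStatsStub(id)));

var caught = 0;
try
{
    await all; // rethrows only the first failure
}
catch (InvalidOperationException ex)
{
    // The WhenAll task itself carries every failure as an AggregateException.
    caught = all.Exception?.InnerExceptions.Count ?? 0;
    Console.WriteLine($"Caught: {ex.Message} (total failures: {caught})");
}
```

Note that this starts every update at once, so with many users you may want to cap the number of calls in flight.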

You should also avoid rethrowing with throw ex;, which resets the exception's stack trace. Use a bare throw; instead. See Is there a difference between "throw" and "throw ex"?
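
A small sketch of the difference, written as a C# top-level program with hypothetical helper names (Fail, TraceAfterThrowEx, TraceAfterThrow). It checks whether the frame of the method that originally threw is still present in the stack trace after each kind of rethrow; the first line should print False and the second True.

```csharp
using System;
using System.Runtime.CompilerServices;

// Hypothetical failing method; NoInlining keeps its frame visible in the stack trace.
[MethodImpl(MethodImplOptions.NoInlining)]
static void Fail() => throw new InvalidOperationException("boom");

static string TraceAfterThrowEx()
{
    try
    {
        try { Fail(); }
        catch (Exception ex) { throw ex; } // stack trace restarts at this line
    }
    catch (Exception outer) { return outer.StackTrace ?? ""; }
}

static string TraceAfterThrow()
{
    try
    {
        try { Fail(); }
        catch (Exception) { throw; } // original stack trace is preserved
    }
    catch (Exception outer) { return outer.StackTrace ?? ""; }
}

Console.WriteLine($"throw ex keeps the Fail frame: {TraceAfterThrowEx().Contains("Fail")}");
Console.WriteLine($"throw keeps the Fail frame:    {TraceAfterThrow().Contains("Fail")}");
```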

jbl
  • I work with christie. Thanks, this is helpful. We use the async delegate to run these in parallel. Is there a way we can do that and still handle exceptions correctly? UpdateClientStats uses an IServiceScopeFactory to create a DbContext, so it's completely isolated. It processes each user, but it needs to do them in parallel or it will take forever. – Chris Kooken Jan 11 '22 at 15:27
  • @ChrisKooken I completed my answer – jbl Jan 11 '22 at 16:08

The answer above from @jbl was very helpful. Here's what I ultimately did to solve the issue (used this article as well: https://devblogs.microsoft.com/pfxteam/implementing-a-simple-foreachasync-part-2/)

I changed this:

userIds.ForEach(async userId => {
    await clientStatsHelper.UpdateClientStats(userId, false, false);
});

to this:

await userIds.ForEachAsyncAggregateException(20, async userId => {
   await clientStatsHelper.UpdateClientStats(userId, false, false);
});

using this extension method:

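// Needs System, System.Collections.Concurrent, System.Collections.Generic,
// System.Linq and System.Threading.Tasks; must live in a static class.
public static Task ForEachAsyncAggregateException<T>(this IEnumerable<T> source, int dop, Func<T, Task> body) {
    var exceptions = new ConcurrentBag<AggregateException>();

    // Record a faulted task's exception instead of letting it propagate.
    void ObserveException(Task task) {
        if (task.Exception != null) {
            exceptions.Add(task.Exception);
        }
    }

    // Once everything has run, surface all recorded failures as one exception.
    void RaiseExceptions(Task _) {
        if (exceptions.Any())
            throw (exceptions.Count == 1 ? exceptions.Single() : new AggregateException(exceptions))
                .Flatten();
    }

    // Split the source into dop partitions and process each partition sequentially.
    return Task.WhenAll(
            from partition in Partitioner.Create(source).GetPartitions(dop)
            select Task.Run(async delegate {
                using (partition)
                    while (partition.MoveNext())
                        // A failed item is recorded and the partition keeps going.
                        await body(partition.Current)
                            .ContinueWith(ObserveException);
            }))
        .ContinueWith(ObserveException)
        .ContinueWith(RaiseExceptions);
}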
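
For comparison, here is a sketch of a different way to get the same two properties (bounded concurrency, and failures that the caller can observe). It uses a SemaphoreSlim as a throttle instead of the Partitioner approach above, so it is an alternative technique and not the code from this answer. ForEachThrottledAsync and UpdateClientStatsStub are hypothetical names; the stub fails for ids divisible by 4.

```csharp
using System;
using System.Collections.Generic;
using System.Linq;
using System.Threading;
using System.Threading.Tasks;

// Hypothetical stand-in for clientStatsHelper.UpdateClientStats.
static async Task UpdateClientStatsStub(long userId)
{
    await Task.Delay(10);
    if (userId % 4 == 0)
        throw new InvalidOperationException($"Update failed for user {userId}");
}

// Runs body for every item with at most maxConcurrency calls in flight.
// Every item is attempted; failures end up on the returned WhenAll task.
static Task ForEachThrottledAsync<T>(IEnumerable<T> source, int maxConcurrency, Func<T, Task> body)
{
    var gate = new SemaphoreSlim(maxConcurrency); // left undisposed for brevity

    async Task RunOne(T item)
    {
        await gate.WaitAsync();
        try { await body(item); }
        finally { gate.Release(); }
    }

    return Task.WhenAll(source.Select(item => RunOne(item)).ToList());
}

var userIds = Enumerable.Range(1, 10).Select(i => (long)i).ToList();
var run = ForEachThrottledAsync(userIds, 3, UpdateClientStatsStub);

var failures = 0;
try
{
    await run; // rethrows the first failure, so an outer try/catch does see it
}
catch (InvalidOperationException)
{
    failures = run.Exception?.InnerExceptions.Count ?? 0;
}
Console.WriteLine($"{failures} of {userIds.Count} updates failed");
```

The design difference is that this version creates one task per item up front and lets the semaphore decide how many run at once, whereas the partitioner version creates one worker per partition.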
christie