I want to get all the records from the database using EF Core and then save them to another database. I have 149541 rows in the database and I couldn't fetch them all at one time because it is very slow (maybe more than one hour) so I decided to do some Skip and Take. I did it in a loop where I icrease skip number every iteration but for some reason this loop never ended.
then I did some logging and saw that skip variable became zero.
var start = DateTime.UtcNow;
int skipAmount;
const int step = 500;
var parallelTasksBatchSize = step / 10;
var totalEmailMessagesCount = await _context.Set<EmailMessage>().CountAsync();
_errorLoggerQueue.LogError(new InformationLog($"Operation started. email messages count: {totalEmailMessagesCount}"), operationName);
var client = new MyClient(_restClient);
for (skipAmount = 0; skipAmount < totalEmailMessagesCount; skipAmount += step)
{
var emailMessages = await _context.Set<EmailMessage>()
.Include(x => x.EmailSubTemplates)
.OrderBy(x => x.CreatedAt)
.Skip(skipAmount)
.Take(step)
.AsNoTracking()
.ToListAsync();
var tasks = new List<Task<RestResponse>>();
foreach (var emailMessage in emailMessages)
{
var emailMessageLogDto = new EmailMessageLogDto
{
//not important what properties I have there.
//and it's log of email that had been sent from this application
//it's not a log of something that happening in this script
//I'm just moving those email messages from ome DB to another
};
tasks.Add(client.CreateEmailMessageLogAsync(emailMessageLogDto));
if (tasks.Count == parallelTasksBatchSize)
{
var responses = await Task.WhenAll(tasks);
tasks.Clear();
var contentsOfFailedResponses =
responses.Where(x => !x.IsSuccessful).Select(x => x.Content).ToList();
var failFeedback = contentsOfFailedResponses.Count == 0
? string.Empty
: string.Join(" |_| ", contentsOfFailedResponses);
if (!string.IsNullOrEmpty(failFeedback))
{
_errorLoggerQueue.LogError(new InformationLog($"timeNow: {DateTime.UtcNow.ToShortTimeString()}, failFeedback: {failFeedback}"), operationName);
}
}
}
_errorLoggerQueue.LogError(new InformationLog($"skip: {skipAmount}, step: {step}, totalCount: {totalEmailMessagesCount}, number of emailMessages = {emailMessages.Count}"), operationName);
}
_errorLoggerQueue.LogError(new InformationLog($"start time: {start}, finish time: {DateTime.UtcNow}, finish time - start time = {DateTime.UtcNow - start}"), operationName);
and it's working ok but in one moment it did some really weird thing. so the loop never ends. see the screenshot of the logs
And here is the calling code (.InitAsync()
)
namespace SuperNameSpace
{
public class Program
{
public static async Task Main(string[] args)
{
CultureInfo cultureInfo = new CultureInfo("en-US");
cultureInfo.NumberFormat.CurrencySymbol = "$";
CultureInfo.DefaultThreadCurrentCulture = cultureInfo;
CultureInfo.DefaultThreadCurrentUICulture = cultureInfo;
IWebHost webHost = CreateWebHostBuilder(args).Build();
await webHost.Services.GetRequiredService<IInitializationService>().InitAsync();
webHost.Run();
}
public static IWebHostBuilder CreateWebHostBuilder(string[] args)
{
return WebHost.CreateDefaultBuilder(args).UseStartup<Startup>();
}
}
}
Application and DB is deployed on azure
I assume that Azure app service reruns Main
method after some timeout because web application is not actually running at the moment of initialization. Can I somehow prevent it from happening? Or is there any better approach for running some kind of initialization scripts?
For now I put this code in a simple command (This application uses CQRS) and triggered it with Postman. Command has been complete in 47 minutes and it worked fine. everything was moved without failed responses