I've had a similar issue using `BulkWriteAsync` with a `ReplaceOneModel` (you use `BulkWrite`... I suggest you go async, but that's an aside).
I have a strategy that has performed well for me and deals with bulk writes of tons of documents. I think my method, combined with making smaller batches in the first place (ref. this answer), might do the trick. The strategy is composed of a few tactics...
Tactic One: use `IsOrdered`
Pass a `BulkWriteOptions` to `BulkWriteAsync` with `IsOrdered` set to `false`. An unordered bulk write allows MongoDB to perform the inserts faster. More importantly, in your case it may allow the operation to succeed for many of the documents (not necessarily all of them; see Tactic Two).
NOTE: That last link I provided is to the Java documentation, which says:
If true, then when a write fails, return without performing the remaining writes.
The C# documentation talks only about the ordering itself. Either way, setting `IsOrdered` to false sets up the second tactic.
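For reference, that's nothing more than passing the option along with your models (the same call used in the solution below):

    var options = new BulkWriteOptions { IsOrdered = false };
    await collection.BulkWriteAsync(models, options);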
Tactic Two: when the bulk insert fails, retry the remainder
The thing is, a full bulk insert of a large number of documents will likely fail -- but not completely. This is actually OK, because MongoDB gives you a nice out: the `MongoBulkWriteException` that is thrown carries a list of the documents that couldn't be written (inserted). Again, my scenario was an upsert, but I'm willing to bet a strict insert will have the same or similar issues.
Using the `WriteErrors` property of this exception, you can get a list of those documents and retry them.
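In its simplest form, that inspection looks something like this (just logging here; the full retry loop is in the next section):

    try
    {
        await collection.BulkWriteAsync(models, new BulkWriteOptions { IsOrdered = false });
    }
    catch (MongoBulkWriteException<T> bwe)
    {
        // each error points back (by Index) to the request in `models` that failed
        foreach (var error in bwe.WriteErrors)
        {
            Console.WriteLine($"Request {error.Index} failed: {error.Message}");
        }
    }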
Possible Solution
Here I outline a possible solution. I'm not going to provide the entire set of code, partly because I can't and partly because I have other strategies in play: retries with exponential backoff and jitter, a generic repository implementation, handling for various other errors, and so on -- things that may or may not be relevant to this answer.
So, pseudocode using documents of type `T`:
// setup models; here's mine; yours will be `InsertOneModel`
var models = toUpdate
    .Select(x => new ReplaceOneModel<T>(new ExpressionFilterDefinition<T>(doc => doc.Id == x.Id), x) { IsUpsert = true })
    .ToArray();
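For a plain insert, the equivalent setup would presumably be something like this (`toInsert` standing in for whatever collection holds your documents):

    var models = toInsert
        .Select(x => new InsertOneModel<T>(x))
        .ToArray();

The only other difference downstream is that you'd read `models[x.Index].Document` instead of `.Replacement` when rebuilding the failed set.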
Then make a retry loop and handle the relevant exception, which exposes a `WriteErrors` property of type `IReadOnlyList<BulkWriteError>`. There are a few assumptions you'll have to make about my code; in the error handler, I match each error up to the original model so I can be sure I'm retrying the right documents. If the idea isn't clear, I can try to add more context.
for (var i = 0; ; i++)
{
    try
    {
        var result = await collection.BulkWriteAsync(models, new BulkWriteOptions { IsOrdered = false });
        return result;
    }
    catch (MongoBulkWriteException<T> bwe)
    {
        // give up after enough attempts (or return a failure result instead)
        if (i > DelayCount) throw;

        // rebuild the collection of models, using the failed ones; EntityModel
        // basically contains the entity and the error category (you may not
        // want to retry all of them depending on the category)
        var myModels = bwe.WriteErrors
            .Select(x => new EntityModel<T> { Entity = models[x.Index].Replacement, ErrorCategory = (RepoErrorCategory)x.Category })
            .ToArray();

        models = myModels
            .Select(x => new ReplaceOneModel<T>(new ExpressionFilterDefinition<T>(doc => doc.Id == x.Entity.Id), x.Entity) { IsUpsert = true })
            .ToArray();

        // maybe do an optional delay here...
    }
}
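If you want that optional delay, a minimal exponential-backoff-with-jitter sketch (my own helper, not part of the driver) could stand in where the comment is:

    // hypothetical helper: wait longer on each attempt, plus random jitter so
    // concurrent retries don't all hit the server at the same instant
    private static readonly Random Jitter = new Random();

    private static Task BackoffDelayAsync(int attempt)
    {
        var baseMs = (int)Math.Min(100 * Math.Pow(2, attempt), 5000); // cap at 5 seconds
        return Task.Delay(baseMs + Jitter.Next(0, 100));
    }

Calling `await BackoffDelayAsync(i);` at the "optional delay" comment gives the server a little breathing room between passes.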
The key to matching the errors up to the models is the `Index` property of each `WriteError` in the list: use it to look up the original model (and thus the original document) you were trying to save.
At this point, after as few iterations as possible, everything should be inserted, and the size and time limits will have been circumvented. Essentially, each pass winnows the remaining work down to just the documents that failed, which should ideally be fewer every time.
Final Advice
I suggest staying away from making this transactional if you can; these tactics probably won't work correctly in that scenario. For example, it seems rather obvious that a transactional insert of 10,000 documents won't be able to make use of `WriteErrors`. I'd be willing to bet that in most cases those 10,000 documents are unrelated, and their inserts can be independent.
Avoiding the transaction will allow you to skirt the limits much more easily.