
I'm currently implementing a MongoDB database for caching.

I've made a very generic client, with the save method working like this:

public virtual void SaveAndOverwriteExistingCollection<T>(string collectionKey, T[] data)
{
    if (data == null || !data.Any())
        return;

    var collection = Connector.MongoDatabase.GetCollection<T>(collectionKey.ToString());
    var filter = new FilterDefinitionBuilder<T>().Empty;

    var operations = new List<WriteModel<T>>
    {
        new DeleteManyModel<T>(filter),
    };

    operations.AddRange(data.Select(t => new InsertOneModel<T>(t)));

    try
    {
        collection.BulkWrite(operations, new BulkWriteOptions { IsOrdered = true });
    }
    catch (MongoBulkWriteException)
    {
        // Rethrow without resetting the stack trace.
        throw;
    }
}

In our other clients, calls to this method look similar to this:

public Person[] Get(bool bypassCache = false)
{
    Person[] people = null;

    if (!bypassCache)
        people = base.Get<Person>(DefaultCollectionKeys.People.CreateCollectionKey());

    if (people.SafeAny())
        return people;

    people = Client<IPeopleService>.Invoke(s => s.Get());
    base.SaveAndOverwriteExistingCollection(DefaultCollectionKeys.People.CreateCollectionKey(), people);

    return people;
}

After we've persisted data to the backend, we refresh the cache by calling our Get methods with the argument true, so all of the data is fetched from the service again and written back to MongoDB.
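In code, the refresh after a backend write is just the same Get call with the flag set; the peopleCacheClient name below is only illustrative:

// Skips the MongoDB cache, fetches fresh data from the backend service and
// writes it back through SaveAndOverwriteExistingCollection.
var people = peopleCacheClient.Get(bypassCache: true);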

This works fine for most use cases. But because we run the application as a web garden (multiple worker processes), it leads to concurrency issues. If I save and reload the cache while another user is reloading the page, it sometimes throws an E11000 duplicate key error:

Command createIndexes failed: E11000 duplicate key error collection: cache.Person index: Id_1_Name_1_Email_1 dup key: { : 1, : "John Doe", : "foo@bar.com" }.

Since this is a web garden with multiple IIS worker processes, in-process locking won't do much good. And since bulk writes are supposed to be thread-safe, I'm a bit puzzled. I've looked into upserting the data, but changing our clients to be type-specific and updating each field would take too long and feels like unnecessary work. Therefore I'm looking for a very generic solution.

UPDATE: Removed the Delete and Insert operations and changed the method to build a collection of ReplaceOneModel upserts. I'm currently experiencing an issue where only the last element in a collection is persisted.

public virtual void SaveAndOverwriteExistingCollection<T>(string collectionKey, T[] data)
{
    if (data == null || !data.Any())
        return;

    var collection = Connector.MongoDatabase.GetCollection<T>(collectionKey.ToString());
    var filter = new FilterDefinitionBuilder<T>().Empty;

    var operations = new List<WriteModel<T>>();
    operations.AddRange(data.Select(t => new ReplaceOneModel<T>(filter, t) { IsUpsert = true }));

    try
    {
        collection.BulkWrite(operations, new BulkWriteOptions { IsOrdered = true });
    }
    catch (MongoBulkWriteException)
    {
        // Rethrow without resetting the stack trace.
        throw;
    }
}

Just passed in a collection of 811 items and only the last one can be found in the collection after this method has been executed.

Example of a DTO being persisted:

public class TranslationSetting
{
    [BsonId(IdGenerator = typeof(GuidGenerator))]
    public object ObjectId { get; set; }

    public string LanguageCode { get; set; }

    public string SettingKey { get; set; }

    public string Text { get; set; }
}

With this index:

string TranslationSettings()
{
    var indexBuilder = new IndexKeysDefinitionBuilder<TranslationSetting>()
        .Ascending(_ => _.SettingKey)
        .Ascending(_ => _.LanguageCode);

    return MongoDBClient.CreateIndex(DefaultCollectionKeys.TranslationSettings, indexBuilder);
}
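
For reference, the Id_1_Name_1_Email_1 index from the error message has to be a unique index for E11000 to be raised at all. A minimal sketch of how such an index is created with the 2.x driver follows; the Person property names are inferred from the error message, and it is only an assumption that MongoDBClient.CreateIndex wraps a call along these lines:

// Sketch only: assumes a Person class with Id, Name and Email properties
// (inferred from the duplicate key error message).
var personCollection = Connector.MongoDatabase.GetCollection<Person>(DefaultCollectionKeys.People.CreateCollectionKey());

var personIndexKeys = new IndexKeysDefinitionBuilder<Person>()
    .Ascending(_ => _.Id)
    .Ascending(_ => _.Name)
    .Ascending(_ => _.Email);

// Unique = true is what turns a repeated Id/Name/Email combination into an E11000 error.
personCollection.Indexes.CreateOne(personIndexKeys, new CreateIndexOptions { Unique = true });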
  • Some reason why you would not `UpdateOne` with "upsert"? Of course we cannot see all your architecture, but it would "seem" that it is possible for multiple "workers" to be issuing the same data (combination of name and email). So "upserts" are meant to overcome this problem. In fact there is even `ReplaceOne`, which is basically just an update without atomic operators like `$set`, if your intention is to **always** overwrite. – Neil Lunn Nov 01 '17 at 08:59
  • Yes, my intention is to always overwrite. Considering what the method accepts, I can't set the fields; UpdateOne requires the fields to be set for the update. Perhaps this could be solved with reflection, but it seems overly complicated. – Petter Pettersson Nov 01 '17 at 09:18
  • Use `ReplaceOne` then. It's not a suggestion, but just how it's done. Bottom line is 2 or more processes asking the database to "overwrite" a possible existing entry or create a new one is not a problem, since the worst case is more writes happen than really need to. But 2 or more processes issuing Delete and Insert in combination is just asking for trouble. So upserts with `ReplaceOne` is how you solve the database end. Longer term you should probably look into how those tasks are being farmed out and why multiple processes are working on the same data. But remove the bad effect first. – Neil Lunn Nov 01 '17 at 09:28
  • Currently trying out this solution. Having issues with only the last element actually being persisted. My guess is that my filter, which is currently set to empty, will tell the ReplaceModel to replace everything that matches the filter. Do you have any experience in solving this? – Petter Pettersson Nov 01 '17 at 10:06
  • You can show the code you are attempting if you have a problem. But the usual case is the "filter" you apply with any "upsert" operation is the "unique key". In your case the `Name` and `Email` fields which you have a unique index on throwing the duplicate key error. – Neil Lunn Nov 01 '17 at 10:08
  • Updated the question now. – Petter Pettersson Nov 01 '17 at 10:17
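
Following the suggestion in the comments, a rough sketch of the generic method with one filter per document instead of the shared empty filter. The keyFilter delegate is not part of the original code; it is an assumed extra parameter that lets each caller describe its own unique key:

// Sketch only: keyFilter is a hypothetical delegate the caller supplies to build
// the unique-key filter for each item (e.g. SettingKey + LanguageCode for
// TranslationSetting), so every ReplaceOneModel targets exactly one document.
public virtual void SaveAndOverwriteExistingCollection<T>(
    string collectionKey, T[] data, Func<T, FilterDefinition<T>> keyFilter)
{
    if (data == null || !data.Any())
        return;

    var collection = Connector.MongoDatabase.GetCollection<T>(collectionKey);

    // One upsert per item: a document with the same key is replaced and a missing
    // one is inserted, so no delete pass is needed and two processes writing the
    // same data at worst overwrite each other with identical documents.
    var operations = data
        .Select(t => (WriteModel<T>)new ReplaceOneModel<T>(keyFilter(t), t) { IsUpsert = true })
        .ToList();

    collection.BulkWrite(operations, new BulkWriteOptions { IsOrdered = false });
}

A hypothetical call for the TranslationSetting DTO shown above would pass the compound key as the filter:

base.SaveAndOverwriteExistingCollection(
    DefaultCollectionKeys.TranslationSettings.CreateCollectionKey(),
    settings,
    s => Builders<TranslationSetting>.Filter.Eq(x => x.SettingKey, s.SettingKey)
       & Builders<TranslationSetting>.Filter.Eq(x => x.LanguageCode, s.LanguageCode));

With one independent upsert per document, IsOrdered = false is a choice rather than a requirement; it simply lets the remaining replacements proceed if one of them fails.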
