65

On an ASP.NET MVC project we have several instances of data that require a good amount of resources and time to build. We want to cache them.

MemoryCache provides a certain level of thread safety, but not enough to avoid running multiple instances of the building code in parallel. Here is an example:

var data = cache["key"];
if(data == null)
{
  data = buildDataUsingGoodAmountOfResources();
  cache["key"] = data;
}

As you can see, on a busy website hundreds of threads could enter the if statement simultaneously until the data is built, making the building operation even slower and unnecessarily consuming server resources.

There is an atomic AddOrGetExisting implementation in MemoryCache, but it requires the "value to set" instead of "code to retrieve the value to set", which I think renders the method almost completely useless.

We have been using our own ad-hoc scaffolding around MemoryCache to get this right, but it requires explicit locks. It's cumbersome to use per-entry lock objects, and we usually get away with sharing lock objects, which is far from ideal. That made me think the reasons to avoid such a convention could be intentional.

So I have two questions:

  • Is it a better practice not to lock the building code? (That could prove more responsive, for one, I wonder.)

  • What's the right way to achieve per-entry locking with MemoryCache for such a scenario? The strong urge to use the key string as the lock object is dismissed at ".NET Locking 101".

abatishchev
Sedat Kapanoglu

6 Answers

77

We solved this issue by combining Lazy<T> with AddOrGetExisting to avoid the need for a lock object completely. Here is sample code (which uses infinite expiration):

public T GetFromCache<T>(string key, Func<T> valueFactory) 
{
    var newValue = new Lazy<T>(valueFactory);
    // the line below returns the existing item, or adds the new value if it doesn't exist
    var value = (Lazy<T>)cache.AddOrGetExisting(key, newValue, ObjectCache.InfiniteAbsoluteExpiration);
    return (value ?? newValue).Value; // Lazy<T> handles the locking itself
}

That's not complete. There are gotchas like "exception caching", so you have to decide what you want to do in case your valueFactory throws an exception. One of the advantages, though, is the ability to cache null values too.
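To make the "exception caching" gotcha concrete, here is a hedged sketch (not the answer's original code; it assumes the same `cache` field as above) that passes `LazyThreadSafetyMode.PublicationOnly` so a failed factory does not poison the cache entry:

```csharp
// Sketch only: PublicationOnly stops Lazy<T> from caching an exception
// thrown by valueFactory; a later call to .Value simply runs the factory
// again. The trade-off is that several threads may run the factory
// concurrently, with only the winner's result being published.
public T GetFromCacheNoExceptionCaching<T>(string key, Func<T> valueFactory)
{
    var newValue = new Lazy<T>(valueFactory, LazyThreadSafetyMode.PublicationOnly);
    var value = (Lazy<T>)cache.AddOrGetExisting(key, newValue, ObjectCache.InfiniteAbsoluteExpiration);
    return (value ?? newValue).Value;
}
```

Whether that trade-off is acceptable depends on how expensive the factory is; for very costly builds, the default ExecutionAndPublication mode with manual exception handling may be preferable.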

Sedat Kapanoglu
  • @ssg - I'm trying to use this and I'm having an issue. On the 1st pass when nothing is in the cache, I don't see how `newValue` is ever actually added to the cache once evaluated? Since `value` is `null` on the 1st pass because there is nothing in the cache, and using `Lazy` means the expression is not evaluated until the following line: `return (value ?? newValue).Value;` How or when does `newValue` have a chance to be added to the cache? What I'm seeing is the cache is never populated on each call, because `newValue` is never added. – atconway May 08 '13 at 18:25
  • 1
    @atconway here is the excerpt from AddOrGetExisting documentation: "Return Value: If a cache entry with the same key exists, the existing cache entry; otherwise, null." – Sedat Kapanoglu May 08 '13 at 18:51
  • @ssg - thanks for the response, but I have read that and understood the method. It's just as your snippet is written and implemented in code for me, the cache is *never* populated even with that method that should add it if it does not exist. A fresh call to `newValue.Value` is required thus evaluating the `valueFactory` expression on each call. By the way your code needed to be modified a tad for me. Your last line in my code had to be the following: `return (value ?? newValue.Value);` – atconway May 08 '13 at 18:57
  • 2
    if key doesn't exist `newValue.Value` is called (because `value` will be null) populating it. otherwise `value.Value` is called which returns existing value. what's wrong about it? return value will give a compile error because both have the type of `Lazy`. you have to return `.Value` for both options. – Sedat Kapanoglu May 08 '13 at 19:26
  • @ssg - this comment cleared it up for me: "return value will give a compile error because both have the type of Lazy. you have to return .Value for both options" For me the generic type `T` was not compiling, so I replaced them with the concrete type (in my case `IEnumerable` which is being stored in cache or retrieved from `valueFactory` expression. When doing this I did not have `as Lazy>` on the end of the line that calls `cache.AddOrGetExisting`. Once I had all the types defined the same the original code worked (except generics) – atconway May 08 '13 at 19:47
  • @ssg - to add to this, I wonder why using `T` would not work for me? I kept getting `Cannot resolve symbol 'T'`. I use generics in my repository often, but in comparison I couldn't figure out why it wouldn't work. Since this was such a specific implementation, replacing it with concrete types works ok for now. – atconway May 08 '13 at 19:49
  • @ssg - Figured it out - had to modify method signature to the following and generic implementation worked: `public T GetFromCache(string key, Func valueFactory)` – atconway May 08 '13 at 19:55
  • 3
    AddOrGetExisting should be cast with `as Lazy<T>`. Otherwise, nice one! – Daniel Lidström Sep 06 '13 at 06:11
  • 2
    You could specify `LazyThreadSafetyMode.PublicationOnly` in the `Lazy` constructor to avoid caching exceptions (if desired). – TrueWill Nov 11 '13 at 17:45
  • could you guys plz help me in the similar problem http://stackoverflow.com/questions/26581065/memory-cache-in-web-api – F11 Oct 27 '14 at 04:27
  • 1
    Be careful with `LazyThreadSafetyMode.PublicationOnly`. With that mode multiple threads can begin executing the initialization method at the same time (with the winner's value being used). If the cost of that method is high (hence the caching), this is probably not desired. The only way to get initialization exclusivity is to use the default mode of `ExecutionAndPublication`. Preventing exception caching will need to be handled manually. – Nick Aug 08 '16 at 22:39
11

For the conditional add requirement, I always use ConcurrentDictionary, which has an overloaded GetOrAdd method that accepts a delegate to invoke if the object needs to be built.

ConcurrentDictionary<string, object> _cache = new
  ConcurrentDictionary<string, object>();

public object GetOrAdd(string key)
{
  return _cache.GetOrAdd(key, (k) => {
    // here 'k' is actually the same as 'key'
    return buildDataUsingGoodAmountOfResources();
  });
}

In reality I almost always use static concurrent dictionaries. I used to have 'normal' dictionaries protected by a ReaderWriterLockSlim instance, but as soon as I switched to .NET 4 (it's only available from that version onwards) I started converting any of those that I came across.

ConcurrentDictionary's performance is admirable to say the least :)

Update: a naive implementation with expiration semantics based on age only. It should also ensure that individual items are only created once, as per @usr's suggestion. Update again: as @usr suggested, simply using a Lazy<T> is a lot simpler; you can just forward the creation delegate to it when adding it to the concurrent dictionary. I've changed the code, as my dictionary of locks wouldn't have worked anyway. But I really should have thought of that myself (past midnight here in the UK though and I'm beat. Any sympathy? No, of course not. Being a developer, I have enough caffeine coursing through my veins to wake the dead).

I do recommend implementing the IRegisteredObject interface with this, though, and then registering it with the HostingEnvironment.RegisterObject method; doing that would provide a cleaner way to shut down the poller thread when the application pool shuts down or recycles.

public class ConcurrentCache : IDisposable
{
  private readonly ConcurrentDictionary<string, Tuple<DateTime?, Lazy<object>>> _cache = 
    new ConcurrentDictionary<string, Tuple<DateTime?, Lazy<object>>>();

  private readonly Thread ExpireThread;

  public ConcurrentCache()
  {
    ExpireThread = new Thread(ExpireMonitor);
    ExpireThread.Start();
  }

  public void Dispose()
  {
    //yeah, nasty, but this is a 'naive' implementation :)
    ExpireThread.Abort();
  }

  public void ExpireMonitor()
  {
    while(true)
    {
      Thread.Sleep(1000);
      DateTime expireTime = DateTime.Now;
      var toExpire = _cache.Where(kvp => kvp.Value.Item1 != null &&
        kvp.Value.Item1.Value < expireTime).Select(kvp => kvp.Key).ToArray();
      Tuple<DateTime?, Lazy<object>> removed;
      foreach(var key in toExpire)
      {
        _cache.TryRemove(key, out removed);
      }
    }
  }

  public object CacheOrAdd(string key, Func<string, object> factory, 
    TimeSpan? expiry)
  {
    return _cache.GetOrAdd(key, (k) => { 
        //here 'k' is actually the same as 'key' 
        return Tuple.Create(
          expiry.HasValue ? DateTime.Now + expiry.Value : (DateTime?)null,
          new Lazy<object>(() => factory(k)));
    }).Item2.Value; 
  }
}
Andras Zoltan
  • Could you not implement that on top? A monitor thread that polls the contents, expiring items that go out of date could do it. You could even hide it behind the same `ObjectCache` abstract base that `MemoryCache` uses. One of the cool things about ConcurrentDictionary is that it's multiple reader, single-writer. A polling thread can scan at the same time as another retrieves. Once the list of items to be expired is compiled, they can all be done in turn, and the readers automatically wait. – Andras Zoltan May 11 '12 at 22:37
  • This will allow multiple items to be constructed concurrently and all of one of the will be discarded. ConcurrentDictionary does not call user code under a lock. – usr May 11 '12 at 22:48
  • @usr - right; that's a good point. Perhaps another ConcurrentDictionary of `new object()` instances to act as atomic locks for the user code would be an idea? They can be added using `TryAdd`. It has to be said, btw - that I have found in practise that even with an *immense* amount of load (save running lots of threads thrashing the dictionary), such overlaps occur infrequently. – Andras Zoltan May 11 '12 at 22:58
  • @usr - yes, that's a good idea; okay I'm bowing out now - I'm just swinging in the breeze – Andras Zoltan May 11 '12 at 23:10
  • 2
    @usr, couldn't resist, went for a cigarette and updated per your suggestion to use Lazy, from my phone. I'll be using that pattern myself a lot more from now on too I think! – Andras Zoltan May 11 '12 at 23:29
  • Wow, thank you for going great lengths to implement a prototype. But I basically want to avoid such work. As I said we already have some workaround in place and I'm trying to avoid that too. I don't think implementing a new cache manager from scratch is the best idea since there are too many factors to take into account. – Sedat Kapanoglu May 11 '12 at 23:31
  • @ssg no worries. Fair dos :) – Andras Zoltan May 11 '12 at 23:46
  • @usr Regarding [this](http://i.imgur.com/f83P8xt.png) , If there is another slot which uses the _same_ `factory(k)` parameters so it will actually be executed twice....right? – Royi Namir Feb 23 '15 at 10:12
  • @RoyiNamir the "factory" is guarded by Lazy so that it only ever runs once. This `CD>` pattern is standard and safe. – usr Feb 23 '15 at 10:38
  • @usr I think i'm not understood :-). looking [here](http://i.imgur.com/8g8SziQ.png) (notice same parameters for factory) if user asks for `key A` it will run , but if a user asks for `key B` , will the function run _again_ ? – Royi Namir Feb 23 '15 at 10:42
  • @RoyiNamir yes it will run again. The dictionary cannot detect that this is really the same computation (nor is it supposed to). – usr Feb 23 '15 at 11:00
  • @usr Thanks , one last thing : in the original question he did `cache["key"] = LongRunningOperation()` . but in _what_ time does the cache starts _knowing and arming_ "`key`" ? is it at the start of `LongRunningOperation` or at the end of it ? – Royi Namir Feb 23 '15 at 11:24
  • @RoyiNamir at the moment you call Lazy.Value because that calls LongRunningOperation. – usr Feb 23 '15 at 11:28
  • @usr I meant [this](http://i.imgur.com/pk1pHKT.png) But i've already answered myself , it is not armed , but only AFTER the method finished - [proof](http://i.imgur.com/mUcBtmv.png). Thank you. – Royi Namir Feb 23 '15 at 11:38
2

Taking the top answer into C# 7, here's my implementation that allows storage from any source type T to any return type TResult.

/// <summary>
/// Creates a GetOrRefreshCache function with encapsulated MemoryCache.
/// </summary>
/// <typeparam name="T">The type of inbound objects to cache.</typeparam>
/// <typeparam name="TResult">How the objects will be serialized to cache and returned.</typeparam>
/// <param name="cacheName">The name of the cache.</param>
/// <param name="valueFactory">The factory for storing values.</param>
/// <param name="keyFactory">An optional factory to choose cache keys.</param>
/// <returns>A function to get or refresh from cache.</returns>
public static Func<T, TResult> GetOrRefreshCacheFactory<T, TResult>(string cacheName, Func<T, TResult> valueFactory, Func<T, string> keyFactory = null) {
    var getKey = keyFactory ?? (obj => obj.GetHashCode().ToString());
    var cache = new MemoryCache(cacheName);
    // Thread-safe lazy cache
    TResult getOrRefreshCache(T obj) {
        var key = getKey(obj);
        var newValue = new Lazy<TResult>(() => valueFactory(obj));
        var value = (Lazy<TResult>) cache.AddOrGetExisting(key, newValue, ObjectCache.InfiniteAbsoluteExpiration);
        return (value ?? newValue).Value;
    }
    return getOrRefreshCache;
}

Usage

/// <summary>
/// Get a JSON object from cache or serialize it if it doesn't exist yet.
/// </summary>
private static readonly Func<object, string> GetJson =
    GetOrRefreshCacheFactory<object, string>("json-cache", JsonConvert.SerializeObject);


var json = GetJson(new { foo = "bar", yes = true });
cchamberlain
2

Here is a simple solution as a MemoryCache extension method.

 public static class MemoryCacheExtensions
 {
     public static T LazyAddOrGetExitingItem<T>(this MemoryCache memoryCache, string key, Func<T> getItemFunc, DateTimeOffset absoluteExpiration)
     {
         var item = new Lazy<T>(
             () => getItemFunc(),
             LazyThreadSafetyMode.PublicationOnly // Do not cache lazy exceptions
         );

         var cachedValue = memoryCache.AddOrGetExisting(key, item, absoluteExpiration) as Lazy<T>;

         return (cachedValue != null) ? cachedValue.Value : item.Value;
     }
 }

And a test for it, as a usage description.

[TestMethod]
[TestCategory("MemoryCacheExtensionsTests"), TestCategory("UnitTests")]
public void MemoryCacheExtensions_LazyAddOrGetExitingItem_Test()
{
    const int expectedValue = 42;
    const int cacheRecordLifetimeInSeconds = 42;

    var key = "lazyMemoryCacheKey";
    var absoluteExpiration = DateTimeOffset.Now.AddSeconds(cacheRecordLifetimeInSeconds);

    var lazyMemoryCache = MemoryCache.Default;

    #region Cache warm up

    var actualValue = lazyMemoryCache.LazyAddOrGetExitingItem(key, () => expectedValue, absoluteExpiration);
    Assert.AreEqual(expectedValue, actualValue);

    #endregion

    #region Get value from cache

    actualValue = lazyMemoryCache.LazyAddOrGetExitingItem(key, () => expectedValue, absoluteExpiration);
    Assert.AreEqual(expectedValue, actualValue);

    #endregion
}
Oleg
2

Sedat's solution of combining Lazy with AddOrGetExisting is inspiring. I must point out that this solution has a performance issue, which seems very important for a caching solution.

If you look at the code of AddOrGetExisting(), you will find that AddOrGetExisting() is not a lock-free method. Compared to the lock-free Get() method, it wastes one of the advantages of MemoryCache.

I would like to recommend the following solution: use Get() first, and then AddOrGetExisting(), to avoid creating the object multiple times.

public T GetFromCache<T>(string key, Func<T> valueFactory) 
{
    T value = (T)cache.Get(key);
    if (value != null)
    {
        return value;
    }

    var newValue = new Lazy<T>(valueFactory);
    // the line below returns the existing item, or adds the new value if it doesn't exist
    var oldValue = (Lazy<T>)cache.AddOrGetExisting(key, newValue, ObjectCache.InfiniteAbsoluteExpiration);
    return (oldValue ?? newValue).Value; // Lazy<T> handles the locking itself
}
Albert Ma
  • Good point, Albert! I must also point out that creating Lazy for every building operation can be GC sensitive too. Also, Lazy isn't async compatible. So we stopped using Lazy for this altogether, but it can be a good starting point. – Sedat Kapanoglu Jul 05 '19 at 23:06
  • @SedatKapanoglu, are certain that Lazy is not compatible with async? What would be your arguments? Counterargument is posted here: https://devblogs.microsoft.com/pfxteam/asynclazyt/ – Almis Jul 09 '19 at 05:29
  • To add to the performance issues - the Get method also has a lock but it locks only when the key expires. – Almis Jul 09 '19 at 06:02
  • @Almis my argument is in the article you shared. – Sedat Kapanoglu Jul 09 '19 at 06:17
  • As Sedat mentioned in the comments, Lazy isn't async compatible. If the valueFactory is async, we can use the AsyncLazy created by others: https://github.com/StephenCleary/AsyncEx/wiki/AsyncLazy – Albert Ma Jul 25 '19 at 02:09
  • @AlbertMa There is a Microsoft supported [`AsyncLazy`](https://learn.microsoft.com/en-us/dotnet/api/microsoft.visualstudio.threading.asynclazy-1) as part of the very good Microsoft.VisualStudio.Threading library which is, despite its name, not Visual Studio exclusive but should be used in every async .NET application. – nkr Mar 02 '21 at 10:10
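For reference, the AsyncLazy<T> idea mentioned in these comments can be sketched in a few lines. This is a simplified, hypothetical version; the AsyncEx and Microsoft.VisualStudio.Threading implementations linked above are more robust:

```csharp
using System;
using System.Runtime.CompilerServices;
using System.Threading.Tasks;

// Simplified sketch: wrap an async factory in a Lazy<Task<T>> so the
// factory starts at most once, and expose GetAwaiter so instances can
// be awaited directly (e.g. 'await myAsyncLazy').
public class AsyncLazy<T> : Lazy<Task<T>>
{
    public AsyncLazy(Func<Task<T>> taskFactory)
        : base(() => Task.Run(taskFactory)) { }

    public TaskAwaiter<T> GetAwaiter() => Value.GetAwaiter();
}
```

An entry stored as AsyncLazy<T> in the cache can then be awaited by every caller, while Lazy<Task<T>> still publishes a single Task underneath.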
1

Here is a design that follows what you seem to have in mind. The first lock is only held for a short time. The final call to data.Value also locks (underneath), but clients will only block if two of them request the same item at the same time.

public DataType GetData()
{      
  Lazy<DataType> data;
  lock(_privateLockingField)
  {
    data = cache["key"] as Lazy<DataType>;
    if(data == null)
    {
      data = new Lazy<DataType>(() => buildDataUsingGoodAmountOfResources());
      cache["key"] = data;
    }
  }

  return data.Value;
}
Neil