6

We've got some Azure Functions defined in a class using [FunctionName] attributes from the WebJobs SDK. There are several functions in the class and they all need access to secrets stored in an Azure KeyVault. The problem is that we have many hundreds invocations of the functions a minute, and since each one is making a call to the KeyVault, KeyVault is failing with a message saying something like, "Too many connections. Usually only 10 connections are allowed."

@crandycodes (Chris Anderson) on Twitter suggested making the KeyVaultClient static. However, the constructor we're using for the KeyVaultClient requires a delegate function for the constructor, and you can't use a static method as a delegate. So how can we make the KeyVaultClient static? That should allow the functions to share the client, reducing the number of sockets.

Here's our KeyVaultHelper class:

public class KeyVaultHelper
{
    public string ClientId { get; protected set; }

    public string ClientSecret { get; protected set; }

    public string VaultUrl { get; protected set; }

    public KeyVaultHelper(string clientId, string secret, string vaultName = null)
    {
        ClientId = clientId;
        ClientSecret = secret;
        VaultUrl = vaultName == null ? null : $"https://{vaultName}.vault.azure.net/";
    }

    public async Task<string> GetSecretAsync(string key)
    {
        try
        {

            using (var client = new KeyVaultClient(new KeyVaultClient.AuthenticationCallback(GetAccessTokenAsync),
                new HttpClient()))
            {
                var secret = await client.GetSecretAsync(VaultUrl, key);
                return secret.Value;
            }
        }
        catch (Exception ex)
        {
            throw new ApplicationException($"Could not get value for secret {key}", ex);
        }
    }

    public async Task<string> GetAccessTokenAsync(string authority, string resource, string scope)
    {
        var authContext = new AuthenticationContext(authority, TokenCache.DefaultShared);
        var clientCred = new ClientCredential(ClientId, ClientSecret);
        var result = await authContext.AcquireTokenAsync(resource, clientCred);

        if (result == null)
        {
            throw new InvalidOperationException("Could not get token for vault");
        }

        return result.AccessToken;
    }
}

Here's how we reference the class from our functions:

public class ProcessorEntryPoint
{
    [FunctionName("MyFuncA")]
    public static async Task ProcessA(
        [QueueTrigger("queue-a", Connection = "queues")]ProcessMessage msg,
        TraceWriter log
        )
    {
        var keyVaultHelper = new KeyVaultHelper(CloudConfigurationManager.GetSetting("ClientId"), CloudConfigurationManager.GetSetting("ClientSecret"),
            CloudConfigurationManager.GetSetting("VaultName"));
        var secret = keyVaultHelper.GetSecretAsync("mysecretkey");
        // do a stuff
    }

    [FunctionName("MyFuncB")]
    public static async Task ProcessB(
        [QueueTrigger("queue-b", Connection = "queues")]ProcessMessage msg,
        TraceWriter log
        )
    {
        var keyVaultHelper = new KeyVaultHelper(CloudConfigurationManager.GetSetting("ClientId"), CloudConfigurationManager.GetSetting("ClientSecret"),
            CloudConfigurationManager.GetSetting("VaultName"));
        var secret = keyVaultHelper.GetSecretAsync("mysecretkey");
        // do b stuff
    }
}

We could make the KeyVaultHelper class static, but that in turn would need a static KeyVaultClient object to avoid creating a new connection on each function call - so how do we do that or is there another solution? We can't believe that functions that require KeyVault access are not scalable!?

Colin Dembovsky
  • 165
  • 3
  • 7

3 Answers3

4

You can use a memory cache and set the length of the caching to a certain time which is acceptable in your scenario. In the following case you have a sliding expiration, you can also use a absolute expiration, depending on when the secrets change.

public async Task<string> GetSecretAsync(string key)
{
    MemoryCache memoryCache = MemoryCache.Default;
    string mkey = VaultUrl + "_" +key;
    if (!memoryCache.Contains(mkey))
    {
      try
      {

          using (var client = new KeyVaultClient(new KeyVaultClient.AuthenticationCallback(GetAccessTokenAsync),
            new HttpClient()))
          {
               memoryCache.Add(mkey, await client.GetSecretAsync(VaultUrl, key), new CacheItemPolicy() { SlidingExpiration = TimeSpan.FromHours(1) });
          }
      }
      catch (Exception ex)
      {
          throw new ApplicationException($"Could not get value for secret {key}", ex);
      }
      return memoryCache[mkey] as string;
    }
}
Peter
  • 27,590
  • 8
  • 64
  • 84
  • Thanks @Peter - we ended up doing something like this. It's a pity that you can't scale functions that require KeyVault - seems a bit of a sticky problem. The MemCache feels kludgy. – Colin Dembovsky Aug 23 '17 at 04:11
  • Instead of using MemCache, you can just store it as a static... unless you actually need to take advantage MemCache's more advanced features. See https://stackoverflow.com/a/50881051/1152054 – Dan Friedman Jun 15 '18 at 18:39
1

try the following changes in the helper:

public class KeyVaultHelper
{
    public string ClientId { get; protected set; }

    public string ClientSecret { get; protected set; }

    public string VaultUrl { get; protected set; }

    KeyVaultClient client = null;

    public KeyVaultHelper(string clientId, string secret, string vaultName = null)
    {
        ClientId = clientId;
        ClientSecret = secret;
        VaultUrl = vaultName == null ? null : $"https://{vaultName}.vault.azure.net/";
        client = new KeyVaultClient(new KeyVaultClient.AuthenticationCallback(GetAccessTokenAsync), new HttpClient());
    }

    public async Task<string> GetSecretAsync(string key)
    {
        try
        {
            if (client == null)
                client = new KeyVaultClient(new KeyVaultClient.AuthenticationCallback(GetAccessTokenAsync), new HttpClient());

            var secret = await client.GetSecretAsync(VaultUrl, key);
            return secret.Value;
        }
        catch (Exception ex)
        {
            if (client != null)
            {
                client.Dispose();
                client = null;
            }
            throw new ApplicationException($"Could not get value for secret {key}", ex);
        }
    }

    public async Task<string> GetAccessTokenAsync(string authority, string resource, string scope)
    {
        var authContext = new AuthenticationContext(authority, TokenCache.DefaultShared);
        var clientCred = new ClientCredential(ClientId, ClientSecret);
        var result = await authContext.AcquireTokenAsync(resource, clientCred);

        if (result == null)
        {
            throw new InvalidOperationException("Could not get token for vault");
        }

        return result.AccessToken;
    }
}

now, the function can use a default static constructor to keep the client proxy:

public static class ProcessorEntryPoint
{
    static KeyVaultHelper keyVaultHelper;

    static ProcessorEntryPoint()
    {
        keyVaultHelper = new KeyVaultHelper(CloudConfigurationManager.GetSetting("ClientId"), CloudConfigurationManager.GetSetting("ClientSecret"), CloudConfigurationManager.GetSetting("VaultName"));
    }

    [FunctionName("MyFuncA")]
    public static async Task ProcessA([QueueTrigger("queue-a", Connection = "queues")]ProcessMessage msg, TraceWriter log )
    {           
        var secret = keyVaultHelper.GetSecretAsync("mysecretkey");
        // do a stuff

    }

    [FunctionName("MyFuncB")]
    public static async Task ProcessB([QueueTrigger("queue-b", Connection = "queues")]ProcessMessage msg, TraceWriter log )
    {
        var secret = keyVaultHelper.GetSecretAsync("mysecretkey");
        // do b stuff

    }
}
Roman Kiss
  • 7,925
  • 1
  • 8
  • 21
  • The problem with storing the whole client is that eventually, the token is going to expire. – Dan Friedman Jun 15 '18 at 18:37
  • Are you sure? Have a look at a feature of the TokenCache for handling an access/refresh token pair and their lifetime, https://learn.microsoft.com/en-us/dotnet/api/microsoft.identitymodel.clients.activedirectory.tokencache?view=azure-dotnet and https://learn.microsoft.com/en-us/azure/active-directory/active-directory-configurable-token-lifetimes – Roman Kiss Jun 16 '18 at 17:05
0

You don't actually want KeyVault to scale like that. It is protecting you from racking up unnecessary costs and slow behavior. All you need to do it save the secret for later use. I've created a static class for static instantiation.

public static class KeyVaultHelper
{
    private static Dictionary<string, string> Cache = new Dictionary<string, string>();

    public static async Task<string> GetSecretAsync(string secretIdentifier)
    {
        if (Cache.ContainsKey(secretIdentifier))
            return Cache[secretIdentifier];

        var kv = new KeyVaultClient(new KeyVaultClient.AuthenticationCallback(GetToken));
        var secretValue = (await kv.GetSecretAsync(secretIdentifier)).Value;
        Cache[secretIdentifier] = secretValue;
        return secretValue;
    }

    private static async Task<string> GetToken(string authority, string resource, string scope)
    {
        var clientId = ConfigurationManager.AppSettings["ClientID"];
        var clientSecret = ConfigurationManager.AppSettings["ClientSecret"];
        var clientCred = new ClientCredential(clientId, clientSecret);

        var authContext = new AuthenticationContext(authority);
        AuthenticationResult result = await authContext.AcquireTokenAsync(resource, clientCred);

        if (result == null)
            throw new InvalidOperationException("Failed to obtain the JWT token");

        return result.AccessToken;
    }
}

Now in your code, you can do something like this:

private static readonly string ConnectionString = KeyVaultHelper.GetSecretAsync(ConfigurationManager.AppSettings["SqlConnectionSecretUri"]).GetAwaiter().GetResult();

Now whenever you need your secret, it is immediately there.

NOTE: If Azure Functions ever shuts down the instance due to lack of use, the static goes away and is reloaded the next time the function is called. Or you can your own functionality to reload the statics.

Dan Friedman
  • 4,941
  • 2
  • 41
  • 65
  • 1
    It is a bad practice for serverless components to store a shareable secret value(s) in the "unknown-invisible" host processes like is done in the above implementation. The secrets are shared across the applications, workers, region, etc. and when we need to change their version, it will be necessary to restart all host processes. Building own functionality to reload the statics in all "unknown-invisible" host processes is a huge task. The other problem can be a security issue for saving a decrypted secret value in the static object of the host process. – Roman Kiss Jun 16 '18 at 17:52
  • 1
    False. The secret is only available to the Function app as each app is run in isolation. A Function app is also only deployed to one region, so multiple regions would require multiple app instances and thus they do not communicate. Storing a secret here is functionally equivalent to how the [ASP.NET Core template stores settings](https://www.dantheprogrammer.com/posts/keeping-secrets) (which include secrets). – Dan Friedman Jun 20 '18 at 17:03
  • Also, if you are going to change the secret, then you must have a process for dealing with that change in your app. You can create a HTTP Trigger to hit that reloads the value. Or a Timer Trigger. Or whatever process makes sense for your app. Again, in ASP.NET Core, there is a way to reload settings `Configuration.Reload()`, but how it is reloaded is up to the user. – Dan Friedman Jun 20 '18 at 17:07
  • In your solution (KeyVaultHelper), each instance of the FunctionApp (during the scalling out) will made a copy of the secret value from the KV. When the secret-KV is shared by multiple distributed enterprise applications and each enterprise application is scalling-out on multiple VirtualMachines, then there are hundred decrypted secret values in the instances on the multiple VMs across the regions. – Roman Kiss Jun 21 '18 at 06:08
  • In the case of the changing a secret-KV version, there is no built-in a broadcasting mechanism to invalidate a secret-copy stored in each app instance. Using a HttpTrigger function will invalidate a cache in one random selected (based on load balancer) instance, not all instances created on the multiple VMs. Using the time-to-live pattern for values in the cache is easy way for invalidating a cache, that's is done in the MemoryCache, TokenCache, etc. Other option is to use a distributed cache. – Roman Kiss Jun 21 '18 at 06:09