How to secure real-world load-balanced WCF service against replay attacks

Question

Whilst creating a new WCF service endpoint to improve our web service security, I started looking at how to prevent replay attacks. At first glance this is easy: WCF has a "DetectReplays" flag that you turn on and everything is sorted. However, even a brief look at the mechanism in use (in-memory nonce caching and duplicate rejection) shows that this is not a real-world implementation. Frankly, it's baffling that they implemented it at all. Anyone sufficiently bothered about security at this level is going to be running more than one server in their web farm, and consequently this mechanism will allow N replays of the same message, where N is the number of servers you have. That nullifies any scaling you have in place to cope with surges in traffic, and could overwhelm the servers. Not to mention the chaos that duplicate create calls will cause.
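To make the "N replays" point concrete, here is a minimal sketch that does not use WCF at all. `Node` and `ReplayDemo` are hypothetical names made up for illustration; each `Node` simply stands in for one server that rejects a nonce its cache has already seen.

```csharp
using System;
using System.Collections.Generic;

// Hypothetical stand-in for one web-farm server; not a WCF type.
class Node
{
    private readonly HashSet<string> _seen;

    // Pass a private set for per-server caching, or one shared set for a central cache.
    public Node(HashSet<string> seen) { _seen = seen; }

    // Returns true if the message is accepted (its nonce is new to this node's cache).
    public bool Accept(string nonce) { return _seen.Add(nonce); }
}

static class ReplayDemo
{
    // Sends the same nonce once to every node and counts how many accept it.
    public static int CountAccepted(IEnumerable<Node> farm, string nonce)
    {
        int accepted = 0;
        foreach (var node in farm)
        {
            if (node.Accept(nonce)) accepted++;
        }
        return accepted;
    }

    static void Main()
    {
        // Per-server caches: no node has seen the nonce, so every node accepts it.
        var isolated = new[]
        {
            new Node(new HashSet<string>()),
            new Node(new HashSet<string>()),
            new Node(new HashSet<string>())
        };
        Console.WriteLine(CountAccepted(isolated, "nonce-1")); // prints 3

        // One cache shared by all nodes: only the first delivery is accepted.
        var shared = new HashSet<string>();
        var central = new[] { new Node(shared), new Node(shared), new Node(shared) };
        Console.WriteLine(CountAccepted(central, "nonce-1")); // prints 1
    }
}
```

With three servers and per-server caches, the same message gets through three times; with one shared cache it gets through once.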

We could turn on sticky sessions... but let's not do that, as that's a whole different set of problems.

Further investigation shows that Microsoft themselves acknowledge this problem: https://learn.microsoft.com/en-us/dotnet/framework/wcf/feature-details/preventing-replay-attacks-when-a-wcf-service-is-hosted-in-a-web-farm

Even by Microsoft standards, that is terse, and fairly useless. They acknowledge the problem, indicate that a solution exists, then provide only the most basic hint as to how to implement it.

Googling reveals that no-one out there has written anything about how to use it. Hunting through their source code shows that they internally use this mechanism with an in-memory implementation to provide the default functionality. It is used in the SecurityProtocolFactory, which sets the NonceCache to the in-memory version if nothing has been supplied. But how do you set up and use a SecurityProtocolFactory in WCF?

I know many will have the reaction that I shouldn't worry about replay attacks, as the transport security will take care of this. However, this is no longer guaranteed. Amazingly, TLS 1.3 added an optional 0-RTT mode as an optimisation, and data sent as 0-RTT early data is not protected against replay by the protocol itself. See https://blog.cloudflare.com/introducing-0-rtt/

So the questions are:

Am I overthinking this? Is it really a problem?
Has anyone actually got the Microsoft implementation to work? If so, how!?
What is everyone else doing? Is everyone just ignoring this problem, unaware of the TLS 1.3 issue?

I have tried setting the NonceCache property on the LocalClientSecuritySettings, but to no effect.

var sbe = (SymmetricSecurityBindingElement) bec.Find<SecurityBindingElement>();

if (sbe != null)
{
    // Get the LocalSecuritySettings from the binding element.
    LocalClientSecuritySettings lc = sbe.LocalClientSettings;
    lc.DetectReplays = true;
    lc.NonceCache = new MyNonceCache();
}
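For reference, this is roughly the shape of what I would expect a shared cache to look like. Treat it as an untested sketch, not a working answer: `ISharedNonceStore`, `InProcessNonceStore`, `SharedNonceCache` and `ReplaySetup` are hypothetical names of my own, and the in-process store is only a placeholder for something every server in the farm can reach. It also assumes that, for a service receiving requests, the cache belongs on `LocalServiceSettings` and not on `LocalClientSettings` as in the fragment above. I have not confirmed that WCF actually calls into it for any particular binding.

```csharp
using System;
using System.Collections.Concurrent;
using System.ServiceModel.Channels;
using System.ServiceModel.Security;

// Hypothetical abstraction over storage shared by all servers
// (a database table, a distributed cache, and so on). Not a WCF type.
public interface ISharedNonceStore
{
    // Atomically records the key; returns false if it was already present.
    bool TryAdd(string key, TimeSpan timeToLive);
    bool Contains(string key);
}

// Placeholder so the sketch runs on one machine. It ignores timeToLive and
// never evicts; a real farm needs an implementation backed by shared storage.
public sealed class InProcessNonceStore : ISharedNonceStore
{
    private readonly ConcurrentDictionary<string, byte> _keys =
        new ConcurrentDictionary<string, byte>();

    public bool TryAdd(string key, TimeSpan timeToLive)
    {
        return _keys.TryAdd(key, 0);
    }

    public bool Contains(string key)
    {
        return _keys.ContainsKey(key);
    }
}

// Derives from the NonceCache extensibility point and delegates to the store.
public sealed class SharedNonceCache : NonceCache
{
    private readonly ISharedNonceStore _store;
    private readonly TimeSpan _window;

    public SharedNonceCache(ISharedNonceStore store, TimeSpan window)
    {
        _store = store;
        _window = window;
    }

    // True if the nonce has already been recorded.
    public override bool CheckNonce(byte[] nonce)
    {
        return _store.Contains(Convert.ToBase64String(nonce));
    }

    // True if the nonce was new and has now been recorded.
    public override bool TryAddNonce(byte[] nonce)
    {
        return _store.TryAdd(Convert.ToBase64String(nonce), _window);
    }
}

public static class ReplaySetup
{
    // Assumption: service-side replay detection is configured on LocalServiceSettings.
    public static void UseSharedCache(SecurityBindingElement sbe, NonceCache cache)
    {
        sbe.LocalServiceSettings.DetectReplays = true;
        sbe.LocalServiceSettings.NonceCache = cache;
    }
}
```

The cache itself behaves as expected when called directly: `TryAddNonce` succeeds once for a given nonce and fails on the repeat. The open question remains whether, and under which binding settings, WCF ever invokes it.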

You can do this at the service level today with WCF. For more information about "Enable Message Replay Detection", please refer to this link: https://learn.microsoft.com/en-us/dotnet/framework/wcf/feature-details/how-to-enable-message-replay-detection?redirectedfrom=MSDN — Ding Peng, Sep 04 '20 at 05:25
Yes, yes you can, but it won't work. That's my point. This will enable replay protection for one server only. If you have 2 or more servers (like everyone will) it doesn't work, as the nonce caches are in memory. You need to specify your own nonce cache .... and then have it work. In my code fragment above I create and assign a nonce cache, but it's never called. There may be a way to get it to execute, but Microsoft (and the internet) are silent on this point. — Kinetic, Sep 04 '20 at 16:45
Replay caches are not shared across a web farm. The mitigation measures provided in the Microsoft documentation are: use message mode security with stateful security context tokens, or configure the service to use transport-level security. My suggestion is to use TLS 1.3, which is better and more secure than the previous versions, and disable 0-RTT. — Ding Peng, Sep 15 '20 at 02:12
@DingPeng hi, this is the frustrating thing about the whole situation. In an enterprise this just isn't realistic. If you enable TLS 1.3, at some point, some well-meaning person is going to "improve performance" by enabling 0-RTT. It may be years later when everyone involved has moved on. The hint at a solution explicitly provided by MS acknowledges these problems, indicates that they have a solution, then leaves you hanging with no way to figure it out. My solution: bin WCF as unsuitable and switch to WebAPI with HMAC. Got it up and running in 45 minutes. — Kinetic, Sep 17 '20 at 13:57
Anyone equally exasperated can follow the tutorial for WebAPI HMAC implementation here: https://dotnettutorials.net/lesson/hmac-authentication-web-api/ from there it's trivial to implement a centralised nonce cache. — Kinetic, Sep 17 '20 at 14:01
The implementation linked provides an obvious place where it is trivial to centralize the nonce cache and therefore detect replay attacks across multiple servers. HMAC didn't directly solve the problem; rather, the simplicity and easily extensible nature of the WebAPI implementation made trivial what had rapidly become an insurmountable problem within the confines of the WCF implementation. — Kinetic, Sep 25 '20 at 11:44
I would say this allowed me to avoid the problem, rather than solve the problem as was pitched. Hence, I haven't answered my own question in case someone finds out how to implement this properly in WCF. It should be possible, MS did the work, they just neglected to tell anyone about it! — Kinetic, Sep 25 '20 at 11:47
Because the replay cache is not shared across a web farm, Microsoft only gives mitigation measures. If you can avoid this problem, avoid it. — Ding Peng, Sep 30 '20 at 08:35
No, as per the link in the question https://learn.microsoft.com/en-us/dotnet/framework/wcf/feature-details/preventing-replay-attacks-when-a-wcf-service-is-hosted-in-a-web-farm, they claim to have implemented a solution. They just don't tell you how to make it work. That's the infuriating thing. It might be a couple of lines of code. It's just which lines...no-one knows. — Kinetic, Sep 30 '20 at 12:45

0 Answers