Citrix has attributed a TCP communication issue we are experiencing to TCP Small Window Attack Protection (TCP-SWAP) being active on the NetScaler load balancer between the client and servers. The load balancer intermittently drops TCP connections, and the advice from Citrix/the article is to disable TCP-SWAP. Because the NetScaler also carries traffic for systems other than mine, disabling this setting may create a global incident and would expose the NetScaler to potential Small Window Attacks.

An alternative to disabling TCP-SWAP would be to ensure the client requests are not classified as Small Window Attacks.

The affected client sends multi-part requests through the NetScaler to the servers. Each request consists of headers, an XML part and a file part, separated by standard generated boundaries. The NetScaler intermittently flags certain requests with attachments sized 62-66 KB as Small Window Attacks and prevents the last part of the request from being received by the server. Subsequent identical requests (including ones using the same boundaries) succeed; the scenario is impossible to replicate on demand, but can be replicated with volume.
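Since the failure only shows up with volume, a small replay harness can help pin down how often it occurs per payload size. The following is a minimal sketch, not the client's actual code: `VolumeRepro`, `payloads` and `countFailures` are hypothetical names, and the `sender` callback stands in for whatever performs the real multipart POST and reports whether a response arrived before the timeout.

```java
import java.util.ArrayList;
import java.util.Arrays;
import java.util.List;
import java.util.function.Predicate;

/** Hypothetical harness: replays payloads in a size range and counts failed sends. */
final class VolumeRepro {

    /** Builds one payload per whole-KB step in [minKb, maxKb], filled with a fixed byte. */
    static List<byte[]> payloads(int minKb, int maxKb) {
        List<byte[]> result = new ArrayList<>();
        for (int kb = minKb; kb <= maxKb; kb++) {
            byte[] body = new byte[kb * 1024];
            Arrays.fill(body, (byte) 'x');
            result.add(body);
        }
        return result;
    }

    /**
     * Sends every payload `rounds` times through `sender`.
     * The sender returns true when a response was received and false on a timeout.
     * Returns the total number of failed sends.
     */
    static int countFailures(List<byte[]> payloads, int rounds, Predicate<byte[]> sender) {
        int failures = 0;
        for (int round = 0; round < rounds; round++) {
            for (byte[] payload : payloads) {
                if (!sender.test(payload)) {
                    failures++;
                }
            }
        }
        return failures;
    }
}
```

In practice the sender would wrap the real `post` call, e.g. `VolumeRepro.countFailures(VolumeRepro.payloads(62, 66), 1000, body -> trySend(body))`, where `trySend` is an assumed helper that returns false on timeout.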

During a qualifying scenario, where the request is treated as a Small Window Attack, the entire client request is passed to the load balancer, and the client then waits for a server response. The server receives the headers, the XML part and ~50% of the file part, then waits for the rest of the file part. The load-balancing NetScaler never transmits the rest of the file part. The client and server both ultimately time out waiting.

Code reviews on the client and server sides reveal no issues with the code. Removing the NetScaler from the communication path fixed the problem; there were no issues for an extended timespan when it was no longer between the client and servers. Unfortunately, the load balancer is required.

The issue is that client requests are being mis-classified as Small Window Attacks by the NetScaler due to TCP-SWAP, which prevents the server from receiving the final part of the request. Other than disabling TCP-SWAP on the NetScaler, what can the client change to prevent requests from being classified as Small Window Attacks?

The client is a Java 8 (IBM SDK 8.0-6.11-linux-x86_64) application running under IBM Liberty (WAS Liberty 20.0.0.7) on Linux (Oracle Linux Server 7.8), and it uses org.apache.httpcomponents.httpclient to communicate via PoolingHttpClientConnectionManager with normal settings for timeout, security and connection management. Is there a configuration setting to apply to either Liberty or HttpComponents to prevent requests from being identified as Small Window Attacks?
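One client-side value that may be worth inspecting is the socket receive buffer (SO_RCVBUF), because the TCP window a host advertises is bounded by it. The sketch below uses only `java.net.Socket` to show the default buffer size and the size the OS actually grants after a request; `ReceiveBufferProbe` is a hypothetical name, and whether a larger buffer changes how the NetScaler classifies the connection is an assumption that would need to be tested.

```java
import java.io.IOException;
import java.io.UncheckedIOException;
import java.net.Socket;

/** Hypothetical probe: reports SO_RCVBUF before and after requesting a new size. */
final class ReceiveBufferProbe {

    /** Returns {defaultSize, sizeAfterRequest} in bytes for an unconnected socket. */
    static int[] probe(int requestedBytes) {
        try (Socket socket = new Socket()) {
            int before = socket.getReceiveBufferSize();
            // This is only a hint: the OS may round, double or cap the value.
            // For windows above 64 KB it has to be set before connecting.
            socket.setReceiveBufferSize(requestedBytes);
            int after = socket.getReceiveBufferSize();
            return new int[] { before, after };
        } catch (IOException e) {
            throw new UncheckedIOException(e);
        }
    }
}
```

Calling `ReceiveBufferProbe.probe(256 * 1024)` on the client host shows what the JVM gets by default. HttpCore 4.4 and later appear to expose a comparable receive-buffer setting through `SocketConfig`, which can be applied to a `PoolingHttpClientConnectionManager`; that should be verified against the library version in use.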

The post code follows; it is atypical:

// Uses HttpPost, ContentType and MultipartEntityBuilder from Apache HttpClient 4.x (httpclient, httpmime).
public HttpResponse post(URL url, Map<String, Object> parameterMap) throws Exception {

    HttpPost httpPost = new HttpPost(url.toString());
    MultipartEntityBuilder builder = MultipartEntityBuilder.create();

    for (Map.Entry<String, Object> entry : parameterMap.entrySet()) {
        Object value = entry.getValue();

        if (value instanceof String) {
            // Text parameters become text parts named after the map key.
            builder.addTextBody(entry.getKey(), (String) value);
        } else {
            // Binary parts get a generated part name; the map key is sent as the filename.
            String partName = getNextPartName();

            if (value instanceof File) {
                builder.addBinaryBody(partName, (File) value, ContentType.APPLICATION_OCTET_STREAM, entry.getKey());
            } else if (value instanceof InputStream) {
                builder.addBinaryBody(partName, (InputStream) value, ContentType.APPLICATION_OCTET_STREAM, entry.getKey());
            } else if (value instanceof byte[]) {
                builder.addBinaryBody(partName, (byte[]) value, ContentType.APPLICATION_OCTET_STREAM, entry.getKey());
            } else {
                throw new IllegalArgumentException("Cannot attach entry " + entry.getKey() + " with Object of class " + value.getClass().getName());
            }
        }
    }

    // Build and attach the multipart entity once, after every part has been added.
    httpPost.setEntity(builder.build());

    return new HttpResponseImpl(getClient().execute(httpPost), httpPost);
}
JoshDM
  • Anything special about the client OS here? In the netscaler doc, things are doomed much earlier than in your scenario. – covener Sep 10 '20 at 12:36
  • @covener - nope; just Oracle Linux Server 7.8 – JoshDM Sep 10 '20 at 13:26
  • @covener - it's a given that the NetScaler is the problem item here and that active TCP-SWAP is the culprit. Just trying to find a way to get the client requests to not qualify as TCP-SWAP without having to disable TCP-SWAP. – JoshDM Sep 10 '20 at 13:27

1 Answer

Assuming you're not actually under a TCP small window attack, then the fact that your TCP window size is so small suggests the most likely issue is that the client-side Linux kernel is deciding that it's under pressure and activating flow control (i.e. setting a small window). I suggest investigating whether your client network stack is well tuned and unsaturated; otherwise, you might try different congestion control algorithms or other cwnd tuning. For example: https://publib.boulder.ibm.com/httpserv/cookbook/Operating_Systems-Linux.html#Operating_Systems-Linux-Networking-TCP_Congestion_Control
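As a starting point for that investigation, the relevant kernel settings can be dumped from the client JVM itself. This is a minimal sketch that assumes a Linux host exposing the usual files under `/proc/sys/net/ipv4/`; the class and method names are hypothetical, and it only reads values, it does not tune anything.

```java
import java.io.IOException;
import java.nio.charset.StandardCharsets;
import java.nio.file.Files;
import java.nio.file.Paths;

/** Hypothetical helper: reads Linux TCP settings relevant to window sizing. */
final class TcpTuningReport {

    /** Reads /proc/sys/net/ipv4/<name>, or returns "(unavailable)" if it cannot be read. */
    static String readSetting(String name) {
        try {
            byte[] raw = Files.readAllBytes(Paths.get("/proc/sys/net/ipv4", name));
            return new String(raw, StandardCharsets.UTF_8).trim();
        } catch (IOException e) {
            return "(unavailable)";
        }
    }

    /** Parses a "min default max" line, the format used by tcp_rmem and tcp_wmem. */
    static long[] parseTriple(String line) {
        String[] fields = line.trim().split("\\s+");
        if (fields.length != 3) {
            throw new IllegalArgumentException("Expected three values, got: " + line);
        }
        return new long[] {
            Long.parseLong(fields[0]), Long.parseLong(fields[1]), Long.parseLong(fields[2])
        };
    }

    public static void main(String[] args) {
        String[] names = { "tcp_congestion_control", "tcp_window_scaling", "tcp_rmem", "tcp_wmem" };
        for (String name : names) {
            System.out.println(name + " = " + readSetting(name));
        }
    }
}
```

The `tcp_rmem` and `tcp_wmem` triples give the minimum, default and maximum buffer sizes in bytes; an unusually small default or maximum receive buffer would be one concrete thing to rule out.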

kgibm
  • We're not actually under a TCP small window attack; these are internal systems and the client request that is being flagged is an expected internal process with expected data. – JoshDM Sep 10 '20 at 13:48
  • The entire process is inside a data center; it's currently using TCP Congestion Control algorithm cubic, which I believe is preferred due to the low latency. – JoshDM Sep 10 '20 at 13:58
  • Then I think the next question is why is the client sending a small window size. If the review of initcwnd didn't show anything, then I recommend engaging Oracle Linux support as this gets into kernel decisions. – kgibm Sep 11 '20 at 03:00
  • we were also seeing this same behavior when both the client and server were Windows servers running Wildfly, so I'm leaning towards this not being an O/S concern. – JoshDM Sep 29 '20 at 16:46
  • It turns out that disabling TCP-SWAP on the NetScaler did not solve the issue, which probably means these requests are NOT qualifying as small window attacks and the issue is elsewhere. Pending further investigation from Citrix. – JoshDM Sep 30 '20 at 15:29
  • Given that it happens on multiple OSes, that suggests this is a consequence of either the sender or the receiver being under pressure and thus deciding to use a small window. I've never personally seen a small window on the handshake, though. I don't know if there's a part of the TCP spec around that. Can you edit your question with a reproduction of the problem and a Wireshark screenshot of one of the small window handshakes? – kgibm Sep 30 '20 at 17:09