14

I'm using EKS (Kubernetes) in AWS and I have problems posting a payload of around 400 kilobytes to any web server that runs in a container in that cluster. I hit some kind of limit, but it's not a hard size limit: at around 400 kilobytes the request often works, but sometimes I get (testing with Python requests)

requests.exceptions.ChunkedEncodingError: ("Connection broken: ConnectionResetError(104, 'Connection reset by peer')", ConnectionResetError(104, 'Connection reset by peer'))

I tested this with different containers (a Python web server on Alpine, a Tomcat server on CentOS, nginx, etc.).

The more I increase the size beyond 400 kilobytes, the more consistently I get: Connection reset by peer.
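
For reference, a rough sketch of how I probe the threshold (the URL and the JSON payload shape are placeholders for my real service and data):

#!/usr/bin/env bash
# Post increasingly large JSON bodies and report which sizes fail.
URL="https://mycontainerapp/endpoint"   # placeholder

for KB in 100 200 300 350 400 425 450 475 500; do
  # 768 raw random bytes become ~1 KB after base64, so the body is roughly ${KB} KB
  {
    printf '{"data":"'
    head -c $((KB * 768)) /dev/urandom | base64 | tr -d '\n'
    printf '"}'
  } > /tmp/payload.json

  printf '%4s KB -> ' "$KB"
  if curl --http1.1 -s -o /dev/null --fail \
       -H "Content-Type: application/json" \
       -X POST --data-binary @/tmp/payload.json "$URL"; then
    echo "ok"
  else
    echo "failed (curl exit $?)"
  fi
done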

Any ideas?

Suresh Vishnoi
StefanH
  • Is the request coming through an AWS ALB? Or are you using kube proxy to send the requests? – Blokje5 Apr 09 '19 at 13:40
  • Or did you use `ingress`? – Jan Garaj Apr 09 '19 at 13:50
  • I use an ingress that creates an ALB. So I use the AWS ALB ingress controller here: https://kubernetes-sigs.github.io/aws-alb-ingress-controller/guide/ingress/annotation/#health-check – StefanH Apr 09 '19 at 13:56
  • However, I see the problem when I request from service to service – StefanH Apr 09 '19 at 13:57
  • What is your web server and how is it configured/started in the container? – Jan Garaj Apr 09 '19 at 14:08
  • It seems it is not related to the container or the web server in the container. I get the same results with a plain nginx container, with Tomcat and with Node.js. They all crash at around 450 kilobytes. It's not a fixed size they crash at, just around that number, and with the same size it sometimes crashes and sometimes doesn't. The closer to 450 kilobytes, the more often it crashes. – StefanH Apr 09 '19 at 16:29
  • curl --http1.1 -v --header "Content-Type: application/json" --request POST -d @bigdata-json https://mycontainerapp – StefanH Apr 10 '19 at 11:54
  • I tested with the curl above. I get `* transfer closed with 23260 bytes remaining to read`, with a different number of bytes remaining each time. – StefanH Apr 10 '19 at 11:55
  • Does this happen only with service-to-service requests? Can you try 1) using `kubectl describe pod {pod's name}` to get its IP, 2) entering another pod with `kubectl exec -it {pod's name} /bin/sh` and 3) curling the IP of the first pod? If it works then the problem must be related to kube-proxy. – victortv Apr 10 '19 at 17:49

5 Answers

6

Thanks for your answers and comments, they helped me get closer to the source of the problem. I upgraded the AWS cluster from 1.11 to 1.12 and that cleared this error when accessing from service to service within Kubernetes. However, the error still persisted when accessing from outside the Kubernetes cluster using the public DNS, and thus the load balancer.

After some more testing I found out that the problem now lies in the ALB or in the ALB ingress controller for Kubernetes: https://kubernetes-sigs.github.io/aws-alb-ingress-controller/

So I switched back to a Kubernetes Service that generates an older-generation ELB and the problem was fixed. The ELB is not ideal, but it's a good work-around for the moment, until the ALB controller gets fixed or I find the right button to press to fix it.
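
For reference, a minimal sketch of the kind of Service I switched to; the name, selector and ports are placeholders for your own deployment. With no load-balancer-type annotation, the in-tree AWS cloud provider provisions a classic ELB:

kubectl apply -f - <<'EOF'
apiVersion: v1
kind: Service
metadata:
  name: mycontainerapp-elb          # placeholder name
spec:
  type: LoadBalancer                # classic ELB on EKS unless annotated otherwise
  selector:
    app: mycontainerapp             # placeholder pod label
  ports:
  - port: 80
    targetPort: 8080                # placeholder container port
EOF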

StefanH
  • Several days and tests later, it seems that this is not related to Docker or Kubernetes in any way, nor to the ALB controller. It is the ALB itself that has this behaviour. I did a test with a plain AWS EC2 instance behind an ALB and I get the same problem. So it is the ALB. Thanks everyone for chipping in – StefanH Apr 25 '19 at 16:30
1

As you mentioned in this answer, the issue might be caused by the ALB or by the ALB ingress controller for Kubernetes: https://kubernetes-sigs.github.io/aws-alb-ingress-controller/.

Can you check if the NGINX ingress controller can be used with the ALB?

NGINX has a default request body size limit of 1 MB. It can be changed using this annotation: nginx.ingress.kubernetes.io/proxy-body-size.
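
For example, a sketch of an Ingress that raises the limit with that annotation; it assumes the NGINX ingress controller is installed in the cluster, and the host, service name and port are placeholders:

kubectl apply -f - <<'EOF'
apiVersion: extensions/v1beta1
kind: Ingress
metadata:
  name: mycontainerapp
  annotations:
    kubernetes.io/ingress.class: nginx
    # allow request bodies up to 8 MB instead of the 1 MB default
    nginx.ingress.kubernetes.io/proxy-body-size: "8m"
spec:
  rules:
  - host: mycontainerapp.example.com        # placeholder host
    http:
      paths:
      - path: /
        backend:
          serviceName: mycontainerapp       # placeholder Service
          servicePort: 80
EOF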

Also, are you configuring connection keep-alive or connection timeouts anywhere?

Ankit Deshpande
  • We are in discussions with AWS support; they confirmed the problem is within the ALB itself. I'm not configuring keep-alive or timeouts. And the exact same client requests against the same server work in a different scenario where another LB is used. – StefanH Apr 18 '19 at 22:02
  • Cool. Can you share the solution once it is fixed? – Ankit Deshpande Apr 21 '19 at 13:36
0

The connection reset by peer, even between services inside the cluster, sounds like it may be the known issue with conntrack. The fix involves running the following:

echo 1 > /proc/sys/net/ipv4/netfilter/ip_conntrack_tcp_be_liberal

And you can automate this with the following DaemonSet:

apiVersion: extensions/v1beta1
kind: DaemonSet
metadata:
  name: startup-script
  labels:
    app: startup-script
spec:
  template:
    metadata:
      labels:
        app: startup-script
    spec:
      hostPID: true
      containers:
      - name: startup-script
        image: gcr.io/google-containers/startup-script:v1
        imagePullPolicy: IfNotPresent
        securityContext:
          privileged: true
        env:
        - name: STARTUP_SCRIPT
          value: |
            #! /bin/bash
            echo 1 > /proc/sys/net/ipv4/netfilter/ip_conntrack_tcp_be_liberal
            echo done
BMitch
  • Hi BMitch, thanks for your answer. However, I do not have a netfilter folder there: `ls: /proc/sys/net/ipv4/netfilter: No such file or directory`. Does this have anything to do with the kernel version? uname -r gives 4.14.106-97.85.amzn2.x86_64 – StefanH Apr 09 '19 at 17:34
  • Anyway, I found this: `echo 1 > /proc/sys/net/netfilter/nf_conntrack_tcp_be_liberal`, but it does not solve the problem – StefanH Apr 09 '19 at 17:49
  • Sorry that didn't solve it, the symptoms looked similar. You'll likely need to start tracing the network traffic like the linked article did to see where the reset is being generated from, and this also depends on your CNI provider. – BMitch Apr 09 '19 at 17:53
  • For details on how to run tools like tcpdump against container network namespaces, check out https://github.com/nicolaka/netshoot#tcpdump – BMitch Apr 09 '19 at 18:09
  • Thanks for the link, I'll use the tool to debug the problem and let you know. One thing I noticed is that curl comes back with an HTTP response even on large uploads, while Python requests gets a connection reset. So it also depends on how the HTTP request is constructed. – StefanH Apr 09 '19 at 19:07
0

As this answer suggests, you may try changing your kube-proxy mode of operation. To edit your kube-proxy config:

kubectl -n kube-system edit configmap kube-proxy

Search for mode: "" and try "iptables", "userspace" or "ipvs". Each time you change the ConfigMap, delete your kube-proxy pod(s) to make sure they read the new ConfigMap.
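
Roughly the procedure, as a sketch (the k8s-app=kube-proxy label is what EKS uses for the kube-proxy DaemonSet pods; verify it on your cluster):

# 1. change mode: "" to "ipvs", "iptables" or "userspace" in the ConfigMap
kubectl -n kube-system edit configmap kube-proxy

# 2. delete the kube-proxy pods so the DaemonSet recreates them with the new ConfigMap
kubectl -n kube-system delete pod -l k8s-app=kube-proxy

# 3. confirm they came back up
kubectl -n kube-system get pods -l k8s-app=kube-proxy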

victortv
  • Victor, thanks for your answer. It seems that this is not updatable in AWS, and I do not understand the full implications of this change; it seems like a big one. – StefanH Apr 12 '19 at 09:08
0

We had a similar issue with Azure and its firewall, which prevents sending more than 128 KB in a PATCH request. After researching and discussing the pros/cons of this approach within the team, our solution was a completely different one.

We put our "bigger" requests into blob storage. Afterwards we put a message onto a queue with the filename created before. A consumer receives the message with the filename, reads the blob from storage, converts it into whatever object you need, and can apply any business logic to this big object. After processing the message, the file is deleted.
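
We are on Azure, but translated to AWS primitives (just an illustration of the pattern; the bucket, queue URL and file names are placeholders) the flow could look roughly like this:

QUEUE_URL="https://sqs.<region>.amazonaws.com/<account-id>/payload-queue"   # placeholder

# producer: store the large payload in S3 and enqueue only the object key
aws s3 cp bigdata.json s3://my-payload-bucket/requests/bigdata.json
aws sqs send-message \
    --queue-url "$QUEUE_URL" \
    --message-body '{"key": "requests/bigdata.json"}'

# consumer: read the message, fetch the blob, apply the business logic, clean up
aws sqs receive-message --queue-url "$QUEUE_URL"
aws s3 cp s3://my-payload-bucket/requests/bigdata.json /tmp/bigdata.json
# ... process /tmp/bigdata.json, then delete both the SQS message and the object ...
aws s3 rm s3://my-payload-bucket/requests/bigdata.json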

The biggest advantage is that our API is not blocked by a big request and its long-running job.

Maybe this can be another way to solve your issue within Kubernetes.

See ya, Leonhard

Leonhard
  • Leonhard, thank you for your idea. But this is not an issue with the way we do things. We have different types of server apps and client apps on different technologies (Java, Python, Node.js) and different needs. We cannot dictate to the teams that all POST payloads must be less than 400 KB. The rest of the apps and servers out there can POST a few MB without any issues; so should we. – StefanH Apr 18 '19 at 22:06