In my application:

  • ASP.NET Core 3.1 with Kestrel
  • Running in AWS ECS + Fargate
  • Services run in a public subnet in the VPC
  • Tasks listen only on port 80
  • Public Network Load Balancer with SSL termination

I want to set the Security Group to allow inbound connections from anywhere (0.0.0.0/0) to port 80, and disallow any outbound connection from inside the task (except, of course, to respond to the allowed requests).
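For concreteness, the rule set described above could be sketched as follows. This assumes the boto3 EC2 client and a placeholder security group ID; the helper names (`ingress_http_from_anywhere`, `default_allow_all_egress`, `apply_rules`) are made up for illustration and are not taken from my actual setup.

```python
def ingress_http_from_anywhere(port=80):
    """Build an IpPermissions list allowing TCP on `port` from 0.0.0.0/0."""
    return [{
        "IpProtocol": "tcp",
        "FromPort": port,
        "ToPort": port,
        "IpRanges": [{"CidrIp": "0.0.0.0/0", "Description": "HTTP from anywhere"}],
    }]


def default_allow_all_egress():
    """The allow-all egress rule that a new security group starts with."""
    return [{"IpProtocol": "-1", "IpRanges": [{"CidrIp": "0.0.0.0/0"}]}]


def apply_rules(group_id):
    """Open port 80 inbound and remove the default allow-all egress rule.

    `group_id` is a placeholder such as "sg-0123456789abcdef0".
    """
    import boto3  # imported lazily so the helpers above work without boto3

    ec2 = boto3.client("ec2")
    ec2.authorize_security_group_ingress(
        GroupId=group_id, IpPermissions=ingress_http_from_anywhere()
    )
    ec2.revoke_security_group_egress(
        GroupId=group_id, IpPermissions=default_allow_all_egress()
    )
```

With the egress rule revoked, the only outbound traffic that should pass is whatever the Security Group's connection tracking associates with an inbound connection on port 80.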

As Security Groups are stateful, the connection tracking should allow the egress of the response to the requests.

In my case, this connection tracking only works for responses without a body (just headers). When the response has a body (in my case, a >1 MB file), the request fails. If I allow outbound TCP connections from port 80, it still fails. But if I allow outbound TCP connections for the full range of ports (0-65535), it works fine.

I guess this is because when ASP.NET Core + Kestrel writes the response body it initiates a new connection which is not recognized by the Security Group connection tracking.

Is there any way I can allow only responses to requests, and no other type of outbound connection initiated by the application?

Victor
  • I always wonder: Can't you ask AWS support? – Tony Stark May 24 '20 at 00:50
  • Actually this is a pretty interesting question IMO. I would be happy if there is a public answer to this somewhere rather than in some private AWS support ticket. – Martin Löper May 24 '20 at 17:30
  • I would also expect the Security Group to track the inbound connection, but as there is an NLB involved and they tend to do some magic, I would try to replace the NLB with an ALB and check if this solves the issue. If so, then there is something about the NLB which breaks the SG conntrack assumption. – Martin Löper May 24 '20 at 17:32
  • Could you install Wireshark on the client and save a capture of a successful access (with all the ports open) as well as an unsuccessful one? This should help clarify whether it's really a connection tracking issue or whether you have another problem. The fact that it also fails when you allow outbound TCP connections from port 80 tells me it's not a connection tracking problem. – Dan M May 30 '20 at 18:56

1 Answer

So we're talking about something like this?

Client 11.11.11.11 ----> AWS NLB/ELB public 22.22.22.22 ----> AWS ECS network router or whatever (kubernetes) --------> ECS server instance running a server application 10.3.3.3:8080 (kubernetes pod)

Do you configure the security group on the AWS NLB or on the AWS ECS? (I guess both?)

Security groups should allow incoming traffic if you allow 0.0.0.0/0 port 80.

They are indeed stateful. They will allow the connection to proceed both ways after it is established (meaning the application can send a response).
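To illustrate what "the same connection" means for a large response, here is a minimal loopback sketch. It uses Python's standard-library HTTP server rather than Kestrel, and the 2 MiB body size and the function name `fetch_over_single_socket` are arbitrary choices for the demo. The client opens exactly one TCP socket, and both the headers and the full body come back over it. It says nothing about how AWS treats the traffic; it only shows the plain HTTP behaviour.

```python
import socket
import threading
from http.server import BaseHTTPRequestHandler, HTTPServer

BODY = b"x" * (2 * 1024 * 1024)  # 2 MiB payload, arbitrary size for the demo


class Handler(BaseHTTPRequestHandler):
    def do_GET(self):
        # Headers and body are written to the same accepted connection.
        self.send_response(200)
        self.send_header("Content-Length", str(len(BODY)))
        self.send_header("Connection", "close")
        self.end_headers()
        self.wfile.write(BODY)

    def log_message(self, *args):
        pass  # keep the demo quiet


def fetch_over_single_socket():
    """Serve BODY on a free loopback port and read it back over one socket."""
    server = HTTPServer(("127.0.0.1", 0), Handler)
    port = server.server_address[1]
    threading.Thread(target=server.serve_forever, daemon=True).start()
    try:
        with socket.create_connection(("127.0.0.1", port)) as sock:
            sock.sendall(b"GET / HTTP/1.0\r\nHost: localhost\r\n\r\n")
            chunks = []
            while True:
                data = sock.recv(65536)
                if not data:  # server closed the connection: response complete
                    break
                chunks.append(data)
    finally:
        server.shutdown()
        server.server_close()
    headers, _, body = b"".join(chunks).partition(b"\r\n\r\n")
    return headers, body


if __name__ == "__main__":
    headers, body = fetch_over_single_socket()
    print(headers.split(b"\r\n")[0].decode(), len(body))
```

If the server had to open a second connection to deliver the body, the client in this sketch would never see it, since it only ever reads from the one socket it created.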

However, firewall state is typically not kept for more than 60 seconds (I'm not sure what technology AWS is using), so the connection can be "lost" if the server takes more than 1 minute to reply. Does the HTTP server take a while to generate the response? If it's a websocket or TCP server instead, does it spend whole minutes at times without sending or receiving any traffic?
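As a rough mental model of what losing state looks like, here is a toy connection tracker with an idle timeout. It is only a sketch: the class `ToyConnTracker` and its methods are invented for illustration, the 60-second default is the guess from the paragraph above rather than a documented AWS value, and real firewalls track far more than an address pair.

```python
class ToyConnTracker:
    """Toy stateful firewall: outbound packets pass only on tracked flows."""

    def __init__(self, idle_timeout=60.0):
        self.idle_timeout = idle_timeout  # seconds; assumed value, not AWS's
        self.last_seen = {}  # (client, server) -> time of the last packet

    def inbound(self, client, server, now):
        # An allowed inbound packet creates or refreshes the flow entry.
        self.last_seen[(client, server)] = now

    def outbound_allowed(self, server, client, now):
        # A reply is allowed only if its flow exists and has not idled out.
        key = (client, server)
        seen = self.last_seen.get(key)
        if seen is None or now - seen > self.idle_timeout:
            self.last_seen.pop(key, None)  # forget the expired entry
            return False
        self.last_seen[key] = now  # traffic keeps the entry alive
        return True


if __name__ == "__main__":
    fw = ToyConnTracker(idle_timeout=60.0)
    client, server = ("11.11.11.11", 50000), ("10.3.3.3", 80)
    fw.inbound(client, server, now=0.0)
    print(fw.outbound_allowed(server, client, now=30.0))  # reply within the window
    print(fw.outbound_allowed(server, client, now=95.0))  # idle for 65 s
```

In this model a reply sent 30 seconds after the request passes, while one sent after 65 seconds of silence is dropped, which is the failure mode to rule out if the server is slow to respond.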

The way I see it, we've got two stateful firewalls: the first with the NLB, the second with ECS.

ECS is roughly equivalent to Kubernetes, so it must be doing a ton of iptables magic to distribute traffic and track connections. (For reference, regular Kubernetes relies heavily on iptables, and iptables has a bunch of very important settings like connection durations and timeouts.)

The good news: if it breaks when you open only inbound 0.0.0.0:80 but works when you open inbound 0.0.0.0:80 plus outbound 0.0.0.0:*, then this is definitely the firewall dropping the connection, most likely because it lost state (or it's not stateful in the first place, but I'm pretty sure security groups are stateful).

The drop could happen on either of the two firewalls. I've never had an issue with a single bare NLB/ELB, so my guess is the problem is in the ECS or the interaction of the two together.

Unfortunately, we can't debug that ourselves, and we have very little information about how this works internally. Your only option will be to work with AWS support to investigate.

user5994461