
Need to make an AWS deployment decision. A lot of this tech (Docker, Beanstalk) is pretty new to me, so I don't know the best practices (and I'm also foggier than I'd like to be on networking and security).

Tech details: I have a Docker application from a client (Python w/ FastAPI) that takes POST requests and returns machine learning model results. I can build and run the application locally, but I need to deploy it on AWS in a scalable fashion. I managed to deploy it to Elastic Beanstalk (similar to this tutorial), which gives me scalability but also a public URL, http://myapp.eba-ri5rfu4f.us-central-1.elasticbeanstalk.com, which already has random bots sending GET .env requests. It doesn't need to interact with my IoT network, just other cloud apps, probably a Lambda function.
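For context, here's a minimal sketch of the kind of endpoint this is; the /predict route and the request schema are made up for illustration, not the client's actual code:

    # Sketch of the kind of FastAPI app being deployed (route and schema are hypothetical).
    from fastapi import FastAPI
    from pydantic import BaseModel

    app = FastAPI()

    class PredictRequest(BaseModel):
        features: list[float]  # placeholder input; the real schema is the client's

    @app.post("/predict")
    def predict(req: PredictRequest):
        # The real app runs an ML model here; this stand-in just echoes something back.
        return {"prediction": sum(req.features)}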

What's the simplest way of deploying this? Here's my understanding of the options (you don't have to address all of these bad ideas, just want to demonstrate I've given some thought to solutions):

1. Add token: The POST input JSON could also expect a security token and return 404 otherwise (see the first sketch after this list). Problems: Requires modifying the Docker application source code (don't want to have to do this!). Also still open on the internet, serving malicious GET requests all day.

2. Build VPC: Could make a VPC that all our cloud apps use. Problems: Don't know how to do this, or if it'll even work. Maybe I'll need one anyway? But it feels like I'm adding a whole layer of architecture to maintain just so one piece gets some security.

3. Security groups: Maybe I just need to add my Elastic Beanstalk environment to a security group, allow only approved IP addresses through the firewall, and that solves everything. Problems: Don't think this works; it's not that simple.

4. Deploy as Lambda function: It'll only interface with whatever resource triggers it, no need for a public URL. Problems: Requires modifying the Docker source code to work with a Lambda handler instead of an API (though see the adapter sketch after this list). Plus it feels like putting a hat on a hat: running a server in a Docker container, then deploying it in a "serverless" environment. Does it have to spin up a server every time the function is invoked? (Also, I already tried to do this with a two-Dockerfile solution I found and gave up after it didn't work.)

5. Do nothing: Our data model is meaningless to everyone; stop wasting time on this. Problems: A malicious actor could still figure out how to make proper requests and charge us thousands in AWS fees. Don't know why someone would do that, but it looks and feels bad to make intellectual property public.
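For option 1, here's roughly what the token check could look like as a FastAPI dependency. This is a sketch, not the client's code: the X-Api-Token header name, the /predict route, and the API_TOKEN environment variable are all made up for illustration.

    # Sketch of option 1: reject requests that lack a shared token.
    # Header name, route, and env var are hypothetical.
    import os

    from fastapi import Depends, FastAPI, Header, HTTPException

    app = FastAPI()
    API_TOKEN = os.environ["API_TOKEN"]  # shared secret, assumed set at deploy time

    def check_token(x_api_token: str = Header(default="")):
        # FastAPI maps the X-Api-Token request header to this parameter.
        if x_api_token != API_TOKEN:
            raise HTTPException(status_code=404)  # 404 rather than 403, to hide the endpoint

    @app.post("/predict")
    def predict(payload: dict, _: None = Depends(check_token)):
        # ...run the model as before...
        return {"ok": True}

As noted in the option above, though, this still leaves the port open to the whole internet; it only filters what gets a real answer.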
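On option 4, for what it's worth, there is an adapter library (Mangum) that wraps an ASGI app like FastAPI as a Lambda handler, so the routes don't have to be rewritten. Roughly, and untested here, with a made-up import path:

    # Sketch of option 4: expose the existing FastAPI app as a Lambda handler.
    from mangum import Mangum

    from myapp.main import app  # hypothetical import path to the existing FastAPI app

    handler = Mangum(app)  # Lambda invokes this; no always-on server process

That sidesteps hand-writing a handler, though it doesn't answer the cold-start concern.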

Appreciate any advice or feedback on this problem. I know it's an open ended question, I just need to brainstorm and confirm I'm not missing an obvious solution.

UPDATE

Using a VPC: Turns out AWS has a default VPC, so I think the best solution is to add my Beanstalk environment to one. I created a new Beanstalk environment, this time selecting a VPC subnet under Configuration->Network. It still has a public URL though:

Instance subnets: subnet-42ttr89
Public IP address: enabled
VPC: vpc-5910921
Visibility: public

Think I'm closer, but still stuck, since I don't see a way to change these settings and make the environment private.

1 Answer


Solution using Security Groups (I think):

Create an Elastic Beanstalk environment normally, then modify its Security Group to allow access only from the IP of a Network Interface of the security group of your trusted AWS resources.

  • I created a normal Elastic Beanstalk environment (no VPC), launched it, and confirmed its public URL was working.
  • Under EC2 > Network & Security, I then found the security group for the new environment's load balancer (there are two security groups; pick the one with "load balancer" in the name).
  • Under Edit Inbound Rules, there should be one rule allowing HTTP from 0.0.0.0/0 (all IP addresses). I replaced this with the public IP address of the Network Interface for the security group of an EC2 instance (under EC2 > Network & Security > Network Interfaces). (Limiting this rule by security groups or subnet IPs didn't work for me.) The same change, scripted, is sketched below.
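The same rule change can be scripted; here is a sketch using boto3, where the security group ID and the trusted IP are placeholders, not real values:

    # Sketch: swap the world-open HTTP rule for a single trusted IP with boto3.
    # SG_ID and TRUSTED_IP are placeholders.
    import boto3

    ec2 = boto3.client("ec2")
    SG_ID = "sg-0123456789abcdef0"   # the load balancer's security group
    TRUSTED_IP = "203.0.113.10/32"   # public IP of the trusted instance's network interface

    # Drop the rule allowing HTTP from anywhere.
    ec2.revoke_security_group_ingress(
        GroupId=SG_ID,
        IpPermissions=[{
            "IpProtocol": "tcp", "FromPort": 80, "ToPort": 80,
            "IpRanges": [{"CidrIp": "0.0.0.0/0"}],
        }],
    )

    # Allow HTTP only from the trusted IP.
    ec2.authorize_security_group_ingress(
        GroupId=SG_ID,
        IpPermissions=[{
            "IpProtocol": "tcp", "FromPort": 80, "ToPort": 80,
            "IpRanges": [{"CidrIp": TRUSTED_IP}],
        }],
    )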

I then ran a Python script performing a POST request to my public URL while SSH'ed into an EC2 instance in the same security group, and it works. But accessing it from my browser times out! (A bit worried that the IP isn't static; will see if this works long term.)
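The test script was along these lines; the /predict route and payload are placeholders, and only the host comes from the deployment:

    # Sketch of the test: POST to the public Beanstalk URL from the trusted instance.
    import requests

    url = "http://myapp.eba-ri5rfu4f.us-central-1.elasticbeanstalk.com/predict"
    resp = requests.post(url, json={"features": [1.0, 2.0, 3.0]}, timeout=10)
    print(resp.status_code, resp.json())

From a browser on an IP outside the rule, the same request just hangs until it times out, which is the blocked behavior described above.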