
I'm working on a Datalake project composed of many services: one VPC (plus subnets, security groups, an internet gateway, etc.), S3 buckets, an EMR cluster, Redshift, Elasticsearch, some Lambda functions, API Gateway, and RDS.

Some of these resources are "static": they will be created only once and will not change in the future, such as the VPC + subnets and the S3 buckets.

The other resources will change during the development and production lifecycle of the project.

My question is: what is the best way to structure this project?

I first started this way:

modules/
  rds/
    main.tf
    variables.tf
    output.tf
  emr/
  redshift/
  s3/
  vpc/
  elasticsearch/
  lambda/
  apigateway/
main.tf
variables.tf

This way I only have to run terraform apply once and it deploys all the services.
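A minimal sketch of what the root main.tf could look like with this layout (the module paths come from the tree above; the variable and output names are illustrative assumptions, not from my actual code):

```hcl
# Root main.tf: wires the service modules together so a single
# "terraform apply" deploys the whole Datalake.
module "vpc" {
  source     = "./modules/vpc"
  cidr_block = var.vpc_cidr # illustrative variable
}

module "s3" {
  source = "./modules/s3"
}

module "rds" {
  source = "./modules/rds"
  # assumes the vpc module exposes this output
  subnet_ids = module.vpc.private_subnet_ids
}
```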

The second option (I have seen some developers use it) is to put each service in a separate folder; to deploy a service, you go into its folder and run terraform apply there.

We will be 2 to 4 developers on this project, and some of us will only work on separate resources.

What strategy do you advise me to follow? Or do you have another idea or best practice?

Thanks for your help.

user1297406

1 Answer

The way we do it is separate modules for each service, with a “foundational” module that sets up VPCs, subnets, security policies, CloudTrail, etc.

The modules for each service are as self-contained as possible. The module for our RDS cluster for example creates the cluster, the security group, all necessary IAM policies, the Secrets Manager entry, CloudWatch alarms for monitoring, etc.
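As a rough sketch of what "self-contained" means here, an RDS module along those lines might look like this (resource arguments are trimmed and all names are illustrative placeholders):

```hcl
# modules/rds/main.tf -- the module owns everything the database needs.
resource "aws_security_group" "rds" {
  name_prefix = "rds-"
  vpc_id      = var.vpc_id
}

resource "aws_db_instance" "this" {
  identifier             = var.identifier
  engine                 = "postgres"
  instance_class         = var.instance_class
  allocated_storage      = 20
  vpc_security_group_ids = [aws_security_group.rds.id]
  # ... IAM policies, Secrets Manager entry, CloudWatch alarms
  # would be defined in this module as well
}

# modules/rds/output.tf -- expose the SG so the deployment
# can stitch this module to others.
output "security_group_id" {
  value = aws_security_group.rds.id
}
```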

We then have a deployment “module” at the top that includes the foundational module plus any other modules it needs. One deployment per AWS account, so we have a deployment for our dev account, for our prod account, etc.

The deployment module is where we set up any inter-module communication. For example, if web servers need to talk to the RDS cluster, we create a security group rule connecting the SG from the web server module to the SG from the RDS module (both modules pass back their security group ID as an output).
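That stitching rule in the deployment module could look like this (the module and output names are assumed for illustration):

```hcl
# Deployment-level stitching: allow the web servers to reach
# Postgres on the RDS cluster, using each module's SG output.
resource "aws_security_group_rule" "web_to_rds" {
  type                     = "ingress"
  from_port                = 5432
  to_port                  = 5432
  protocol                 = "tcp"
  security_group_id        = module.rds.security_group_id # from the RDS module
  source_security_group_id = module.web.security_group_id # from the web module
}
```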

Think of the deployment as a shopping list of modules and stitching between them.

If you are working on a module and the change is self-contained, you can run terraform apply -target=module.modulename to change your piece without disrupting others. When your account has lots of resources, this is also handy because plans and applies run faster.

P.S. I also HIGHLY recommend that you set up remote state for Terraform, stored in S3 with DynamoDB for locking. If you have multiple developers, you DO NOT want to try to manage the state file yourself; you WILL clobber each other's work. I usually have a state.tf file in the deployment module that sets up remote state.
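A state.tf along those lines might look like this (the bucket, key, region, and table names are placeholders you would replace with your own):

```hcl
# state.tf -- remote state in S3 with DynamoDB locking.
terraform {
  backend "s3" {
    bucket         = "my-terraform-state"            # placeholder bucket
    key            = "datalake/dev/terraform.tfstate" # one key per deployment
    region         = "us-east-1"
    dynamodb_table = "terraform-locks"               # placeholder lock table
    encrypt        = true
  }
}
```

Note that backend blocks cannot use variables, so each deployment's state.tf hard-codes its own key.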

myron-semack