AWS CodePipeline to only deploy files that have changed since previous deploy and not simply replace application

Question

TL;DR: How do I push only the changes made in CodeCommit to CodeDeploy?

I built a simple CI/CD pipeline with CodePipeline: I commit to CodeCommit, and CodePipeline then deploys the code to my Elastic Beanstalk application.

The problem is that it seems to simply copy the entire application and put it online. In this way, it removes all of the logs that I had previously on the server. For example, anything listed in .gitignore is not only left out of git; if it already existed on the server, it is removed on deploy.

Any comments or suggestions are greatly appreciated! ❤️

Thanks!

Also, apart from the answers, can you clarify? CodeDeploy can't deploy to Elastic Beanstalk. EB instances do not run the CodeDeploy agent. So what exactly is the role of CodeDeploy in your setup? — Marcin, Jun 23 '20 at 10:18
Hi Marcin, actually, CodeDeploy can deploy to Elastic Beanstalk :) Here are the available Deploy providers: AWS CloudFormation, AWS CodeDeploy, AWS Elastic Beanstalk, AWS Service Catalog, Alexa Skills Kit, Amazon ECS, Amazon ECS (Blue/Green), and Amazon S3. — Brad Ahrens, Jun 23 '20 at 10:24
It seems you are confusing CodePipeline and CodeDeploy. It's CodePipeline which deploys to EB, not CodeDeploy. — Marcin, Jun 23 '20 at 10:26
Ahhhh... you're right. You are absolutely right. Let me edit that above. Sorry for the confusion! — Brad Ahrens, Jun 23 '20 at 10:27
No problem. It is a very common misconception that CodeDeploy deploys to EB. — Marcin, Jun 23 '20 at 10:33
Copy the modified files to an empty bucket using Lambda and use that bucket as your source. Clear out the bucket after a successful deployment. — dev, Jan 02 '21 at 19:22

score 2 · Answer 1 · answered Jun 23 '20 at 10:13

"In this way, it removes all of the logs that I had previously on the server"

An EB environment, whether single-instance or load-balanced, always runs in an Auto Scaling group. This means that its instances can be terminated at any time, e.g. due to AZ rebalancing or changes to your EB environment configuration.

Thus you should build all your applications to be stateless, so that they do not depend on any information stored on the instances. Otherwise, sooner or later this will lead to issues (some of which you are experiencing now).

Very good point. Okay :) I will make sure to separate the logs, etc. from the app itself so that this isn't a problem. :) Thanks! — Brad Ahrens, Jun 23 '20 at 10:25

score 1 · Answer 2 · answered Jun 23 '20 at 10:09

If you wanted to do this whenever CodePipeline is triggered, you would need a first stage that prunes the artifact down to the difference between commits (presumably using Lambda). The pruned artifact would then replace the one that goes to your instances.

Remember that CodeDeploy will replace the contents of the folder with the contents of your artifact, so you'll need to account for this.
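
As a rough illustration of the pruning step, here is a minimal Python sketch that could run inside such a Lambda. It assumes the CodeCommit GetDifferences API (via boto3's `get_differences`) is used to compare the previously deployed commit with the new one. The function names `split_differences` and `fetch_differences` are hypothetical, and building and uploading the pruned artifact is left out.

```python
from typing import Dict, Iterable, List, Tuple


def split_differences(differences: Iterable[Dict]) -> Tuple[List[str], List[str]]:
    """Split CodeCommit difference entries into (changed_paths, deleted_paths)."""
    changed: List[str] = []
    deleted: List[str] = []
    for diff in differences:
        if diff["changeType"] == "D":
            # Deleted files only have a "beforeBlob" entry.
            deleted.append(diff["beforeBlob"]["path"])
        else:
            # "A" (added) or "M" (modified): take the path in the new commit.
            changed.append(diff["afterBlob"]["path"])
    return sorted(changed), sorted(deleted)


def fetch_differences(repo: str, before: str, after: str) -> List[Dict]:
    """Page through GetDifferences for two commit specifiers (needs AWS access)."""
    import boto3  # imported lazily so split_differences works without boto3

    client = boto3.client("codecommit")
    kwargs = {
        "repositoryName": repo,
        "beforeCommitSpecifier": before,
        "afterCommitSpecifier": after,
    }
    results: List[Dict] = []
    while True:
        page = client.get_differences(**kwargs)
        results.extend(page["differences"])
        token = page.get("NextToken")
        if not token:
            return results
        kwargs["NextToken"] = token
```

The changed paths are what a pruned artifact would contain. The deleted paths have to be handled separately, since a partial artifact cannot express "remove this file" by itself.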

However, this is generally bad practice. In fact, you should never rely on a specific server, especially for logging.

Instead, architect your servers to ship their logs to a distributed service such as CloudWatch Logs, an ELK stack, or a third-party supplier. Always be prepared for your infrastructure to fail: making servers easy to replace makes your applications more resilient.
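
One way to ship application logs off the instance is sketched below in Python: a `logging.Handler` that buffers records and sends them with the CloudWatch Logs `put_log_events` call. The class name `CloudWatchBatchHandler` and the group and stream names in the usage note are hypothetical. The sketch assumes the log group and log stream already exist and that the instance role is allowed to write to them.

```python
import logging
from typing import Any, Dict, List


class CloudWatchBatchHandler(logging.Handler):
    """Buffer log records and send them to CloudWatch Logs in batches."""

    def __init__(self, client: Any, group: str, stream: str, batch_size: int = 20):
        super().__init__()
        self.client = client          # e.g. boto3.client("logs")
        self.group = group
        self.stream = stream
        self.batch_size = batch_size
        self.buffer: List[Dict[str, Any]] = []

    def emit(self, record: logging.LogRecord) -> None:
        # CloudWatch Logs expects timestamps in milliseconds since the epoch.
        self.buffer.append({
            "timestamp": int(record.created * 1000),
            "message": self.format(record),
        })
        if len(self.buffer) >= self.batch_size:
            self.flush()

    def flush(self) -> None:
        self.acquire()  # the handler lock is re-entrant, so emit() may call this
        try:
            if not self.buffer:
                return
            # Events in one batch must be in chronological order.
            events = sorted(self.buffer, key=lambda e: e["timestamp"])
            self.buffer = []
            # Note: if this call raises, the batch is lost (no retry here).
            self.client.put_log_events(
                logGroupName=self.group,
                logStreamName=self.stream,
                logEvents=events,
            )
        finally:
            self.release()
```

It could be attached with something like `logging.getLogger().addHandler(CloudWatchBatchHandler(boto3.client("logs"), "my-app", "instance-1"))`. Because the logs leave the instance as they are written, a redeploy or instance replacement no longer loses them.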

Very good points and good idea. I'll push the logs to CloudWatch! — Brad Ahrens, Jun 23 '20 at 10:21
