Background

I have a deployment setup, triggered by Jenkins, that works as follows:

  • The files to be deployed are prepared by Phing on a separate build server. Phing talks to the git server and takes a git diff between the required revisions. As far as I can tell, AWS CodeDeploy is not involved at this stage. The Phing build is triggered by Jenkins.
  • I dynamically add only the files to be added/modified (based on the git diff between the revisions) to the appspec.yml file. I copy only those files to /home/jenkins/deployment/cd_deploy/codebase/, and I have specified /home/jenkins/deployment/cd_deploy/ under the "Use custom workspace" option in the "Advanced Project Options" of the Jenkins project. This is the location on the build server that gets uploaded to the S3 bucket. Note that I also need to delete, from the instances, the files that were deleted between the two git revisions.
  • Jenkins then triggers AWS CodeDeploy with the application name and deployment group that I have configured in CodeDeploy.

Problem

The files that I dynamically add to the appspec.yml file are getting added/modified on the EC2 instances, as I expect. Strangely, however, the files that are to be deleted are also getting deleted. I verified that there is no logic in my BeforeInstall hook that deletes those files. BeforeInstall is the only hook in my appspec file, and it runs a single script, beforeInstall.sh. As soon as I remove that hook from the appspec file, the deletion stops. Here is my appspec file:

version: 0.0
os: linux
files:
{Pair of files dynamically generated}
  - source: config/deployment_config.json
    destination: /var/cake_1.2.0.6311-beta/deployment
permissions:
  - object: .
    pattern: "**"
    owner: sandeepan
    group: sandeepan
    mode: 777
    type:
      - file
hooks:
  BeforeInstall:
    - location: beforeInstall.sh

Is AWS CodeDeploy somehow talking to my git hosting (I am using GitLab, not even GitHub) and getting the information about the files to be deleted?

Update

I later observed that the files to be deleted are still getting deleted automatically, even after I removed the hooks section completely from the appspec.yml file and deleted the corresponding .sh files (beforeInstall.sh, afterInstall.sh, etc.) from the central build server where the S3 bundle is prepared, so that none of my logic, nor any reference to it, reaches the instances.

Update 2

Today I found that files modified between the git revisions are also getting deleted automatically. I have logic that dynamically prepares the appspec.yml file, and I changed it to leave out some files. So there were some files that were in the git diff but not in the appspec file. As a result, they got deleted and did not reappear. It seems CodeDeploy automatically does a cleanup before the deployment. How do I stop that? I would like to use my own custom cleanup logic instead.

Update 3

Content of beforeInstall.sh -

OUTPUT="$(w | grep -Po '(?<=load average: )[^,]*')"
rm -f /var/cake_1.2.0.6311-beta/deployment/deployment_config.json
path="$PWD"
php $path"/deployment-root/"$DEPLOYMENT_GROUP_ID"/"$DEPLOYMENT_ID"/deployment-archive/beforeInstall.php" ${OUTPUT}

/usr/local/nagios/libexec/check_logwarn -d /tmp/logwarn_hiphop_error /mnt/log/hiphop/error_`(date +'%Y%m%d')`.log #Just run a nagios check, so that counter corresponds to the line in the log corresponding to current timestamp/instant. Do not care about output. Note that we are not even looking for error hinting keywords (and hence not using -p because it needs to be used alongwith), because all we need to care about here is incrementing the nginx counter.

/usr/local/nagios/libexec/check_logwarn -d /tmp/logwarn_nginx_access /mnt/log/nginx/access_`(date +'%Y%m%d')`_`(date +'%H')`.log #Just run a nagios check, so that counter corresponds to the line in the log corresponding to current timestamp/instant. Acceptable http codes are also not being read from deployment_config.json.
printf "\n `date +%Y-%m-%d:%H:%M:%S` End of beforeInstall.sh"  >> /var/cake_1.2.0.6311-beta/deployment/deployment.log
exit 0

And content of beforeInstall.php which is called from the above -

<?php 
file_put_contents('/var/cake_1.2.0.6311-beta/deployment/deployment.log', "\n ".date("Y-m-d H:i:s")." - Load print  ".$argv[1], FILE_APPEND);
$loadData = json_encode(array("load" => intval($argv[1]), "access_error_check_day" => date("Ymd"), "access_error_check_hour" => date("H"))); //error_check_day -> day when nagios error check was last run. We will accordingly check log files of days in between this day and the day of afterinstall (practically this can include a span of 2 days).

file_put_contents("/var/cake_1.2.0.6311-beta/deployment/serverLoad.json",$loadData); //separate from deployment_config.json. serverLoad.json is not copied from build server.
file_put_contents('/var/cake_1.2.0.6311-beta/deployment/deployment.log', "\n ".date("Y-m-d H:i:s")." loadData to config ".$loadData, FILE_APPEND);
?>
Sandeepan Nath
  • Post your BeforeInstall script. I'm guessing you're deleting the files there. – mttdbrd Nov 30 '16 at 17:59
  • @mttdbrd please check Update 3 section of question. – Sandeepan Nath Dec 01 '16 at 07:44
  • I do not have any logic there to delete all files. I had some in the afterInstall step, but that was only to delete the files to be deleted (as per git diff). Removing the afterInstall step in the hooks section should make that void. As I have written, even after removing the hooks section completely, files are still getting deleted. – Sandeepan Nath Dec 01 '16 at 13:12
  • Could you post the output of tree above the content directory? – mttdbrd Dec 01 '16 at 15:21
  • I didn't understand. What do you mean by that? – Sandeepan Nath Dec 02 '16 at 06:49
  • In the directory that contains the contents subdirectory, run the 'tree' command and paste the output. Tree is a standard UNIX utility. I want to see all the files in the revision directory tree. – mttdbrd Dec 02 '16 at 06:51
  • Ok. Where do I find the contents subdirectory? – Sandeepan Nath Dec 02 '16 at 07:21

1 Answer

CodeDeploy is designed to deploy applications, not simply copy a specific and constantly different set of files.

As such, before deploying each 'revision', CodeDeploy will first clean up any files deployed by the previous revision. Let me explain.

So, let's say the previous app deployment uploaded three files:

File A
File B
File C

And then the next deployment only included these files:

File A
File C

CodeDeploy will first clean up the three files it deployed in the first revision (A, B and C), and then deploy your new revision. It never simply uploads the intended files; it always cleans up the old files first (determined by looking at the previous 'revision'). This is important because it sheds some light on what seems like mysterious behavior in your case. The result after deployment is, of course:

File A
File C
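
The cleanup-then-install sequence above can be sketched as a small shell model. This is only an illustration of the behavior described here, not the CodeDeploy agent's actual code: the `deploy` function and the `.manifest` file are hypothetical names invented for the sketch, and a temporary directory stands in for the destination folder.

```shell
#!/usr/bin/env bash
# Toy model of the cleanup-then-install behaviour described above.
# NOT the CodeDeploy agent's real code: deploy() and the .manifest file
# are hypothetical, chosen only for illustration.
set -eu

target="$(mktemp -d)"          # stands in for the destination folder
manifest="$target/.manifest"   # hypothetical record of the last revision's files

deploy() {
  # Step 1: remove every file the previous revision installed.
  if [ -f "$manifest" ]; then
    while IFS= read -r name; do
      rm -f "$target/$name"
    done < "$manifest"
  fi
  # Step 2: install the new revision and record what was installed.
  : > "$manifest"
  for name in "$@"; do
    echo "contents of $name" > "$target/$name"
    echo "$name" >> "$manifest"
  done
}

deploy A B C   # first revision
deploy A C     # second revision: B is not listed, so it disappears

ls "$target"   # shows which files remain after the second revision
```

In this model nothing ever deletes File B explicitly. It disappears because the second revision does not put it back after the cleanup, which matches what the question observes.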

Now, it gets interesting if you've manually added files into the mix outside of CodeDeploy. It will only clean up things it knows about, and it won't overwrite files that are still present after this cleanup phase. This is often seen when people have manually installed an application and then tried to run a CodeDeploy deployment to the same folder: there's no previous revision, so there is nothing to clean up, and when it then tries to copy on top of the existing files it errors out. You typically want your target folder to be 'naked' so you can start the revision history properly.

For example, in the previous scenario, if you had uploaded Files A, B and C manually, then the deployment of Files A & C would have failed: CodeDeploy wouldn't know to clean up A, B and C first, and it would then give you an error when trying to overwrite Files A and C.

A file (or folder) completely outside the deployment... i.e. not part of either revision, say File D... would be untouched and remain happily there both before and after the deployment without complaint. This is useful for placing data files and such things that may be specific to the deployment but aren't necessarily part of the code base that you don't want to constantly redeploy.
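
The two cases above (manually placed files that collide with a revision, and a File D that no revision ever mentions) can be modeled the same way. Again, this is a toy sketch under assumptions: `deploy` and `.manifest` are hypothetical names, and the "refuse to overwrite" check simply imitates the behavior described in this answer.

```shell
#!/usr/bin/env bash
# Toy model only: deploy() and the .manifest file are hypothetical names,
# not part of the CodeDeploy agent.
set -eu

deploy() {
  local target="$1"; shift
  local manifest="$target/.manifest" name
  # Clean up only what the previous revision recorded.
  if [ -f "$manifest" ]; then
    while IFS= read -r name; do
      rm -f "$target/$name"
    done < "$manifest"
  fi
  # Refuse to overwrite anything that survived the cleanup.
  for name in "$@"; do
    if [ -e "$target/$name" ]; then
      echo "error: $name already exists" >&2
      return 1
    fi
  done
  # Install the new revision and record it.
  : > "$manifest"
  for name in "$@"; do
    echo "contents of $name" > "$target/$name"
    echo "$name" >> "$manifest"
  done
}

# Case 1: A, B and C were copied by hand, so there is no revision history.
manual="$(mktemp -d)"
touch "$manual/A" "$manual/B" "$manual/C"
if deploy "$manual" A C; then manual_result=deployed; else manual_result=failed; fi

# Case 2: D was placed by hand, but it is never part of any revision.
clean="$(mktemp -d)"
echo "data" > "$clean/D"
deploy "$clean" A B C
deploy "$clean" A C

echo "manual folder: $manual_result"
ls "$clean"
```

In the first case the model reports a failure, because File A already exists and no recorded revision allows it to be cleaned up. In the second case File D is still there after both deployments, alongside Files A and C.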

Now, you can do lots of interesting things using the hooks, of course, but that feels like the wrong tool for the job at hand. The hooks are intended for things like stopping and starting services, not for managing the file copying that is at the heart of what CodeDeploy should be doing for you.

Excluding all files from the app spec (i.e. no files specified) and simply using BeforeInstall and/or AfterInstall steps to perform the copy logic is an approach that may work for some scenarios.
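
A minimal sketch of what such a hook script might do is shown below. It assumes a bundle layout that is not defined by CodeDeploy: a `codebase/` directory holding the files to add or modify (as in the question's build path), plus a hypothetical `deleted_files.txt` listing the relative paths to remove, one per line. The `apply_changes` function name is also made up for this example, and temporary directories stand in for the bundle and the destination.

```shell
#!/usr/bin/env bash
# Sketch of a hook-only deployment step. Assumptions (not defined by
# CodeDeploy): the bundle has a codebase/ directory with the files to add
# or modify, and a deleted_files.txt listing paths to remove, one per line.
set -eu

apply_changes() {
  local bundle="$1" dest="$2" rel
  # Custom cleanup: delete only the paths the build marked as deleted.
  if [ -f "$bundle/deleted_files.txt" ]; then
    while IFS= read -r rel; do
      if [ -n "$rel" ]; then
        rm -f -- "$dest/$rel"
      fi
    done < "$bundle/deleted_files.txt"
  fi
  # Copy added/modified files over the existing tree; everything else stays.
  cp -R "$bundle/codebase/." "$dest/"
}

# Demo with temporary directories standing in for the bundle and the target.
bundle="$(mktemp -d)"
dest="$(mktemp -d)"
mkdir -p "$bundle/codebase/app" "$dest/app"
echo "new"   > "$bundle/codebase/app/index.php"
printf 'app/old.php\n' > "$bundle/deleted_files.txt"
echo "old"   > "$dest/app/index.php"
echo "stale" > "$dest/app/old.php"
echo "keep"  > "$dest/app/untouched.php"

apply_changes "$bundle" "$dest"
ls "$dest/app"
```

In a real hook, the bundle directory would be the deployment archive on the instance; the question's beforeInstall.sh builds that path as `$PWD/deployment-root/$DEPLOYMENT_GROUP_ID/$DEPLOYMENT_ID/deployment-archive`. The sketch does not validate the paths in `deleted_files.txt`, so a real script should reject entries that point outside the destination.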

In any case, maybe this better understanding of how CodeDeploy operates will help you craft a solution. I don't think it's particularly well documented. My understanding comes from observing and struggling with it myself.

Brett Green
  • Yes, that is how CodeDeploy is designed. After contacting the AWS support team, I finally bypassed the Install step: instead of listing the files to be added/modified in the files section of the appspec file, I added my own logic to add/modify files in the AfterInstall step. Since CodeDeploy no longer found any files in the files section, it no longer did the cleanup. I have also added my own cleanup logic in the BeforeInstall step. Things are working fine now. Add these to the answer and I will accept it. – Sandeepan Nath Dec 21 '16 at 10:23
  • Added that approach to the answer... thanks for the feedback. – Brett Green Dec 21 '16 at 13:59
  • Great answer! Clear explanation. – mttdbrd Dec 25 '16 at 13:20