Your instance will only be "cloned" if you have a recent AMI (Amazon Machine Image) of it. The filesystem will probably have changed since that image was taken, so it's a good idea to use AWS userdata to supply a Bash/cloud-init script that updates the areas that change, e.g. codebase, media etc.
There are two main options for where to pull those updates from, each with pros and cons:
- Central storage point on S3:
  - Pro: Permissions to access the bucket are managed via the IAM role you assign to your instances
  - Pro: Fast I/O bandwidth between AWS services
  - Pro: Consistent uptime
  - Con: Unless you use bucket versioning, you don't have the flexibility that source control gives you for revisions
- A source control (Git, Subversion) repository for bootstrapping data:
  - Pro: Allows you to dynamically update your bootstrapping scripts via source control, have contributions and history for it etc
  - Con: Requires a (potentially) external connection to your Git host, plus permissions and security group configuration to allow this
  - Con: Probably a bit slower than S3
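As a rough sketch of the S3 option, a userdata script could wrap `aws s3 sync` in a small helper. This assumes the AWS CLI is installed on the image and that the instance's IAM role can read the bucket; the function name `fetch_from_s3` is made up for illustration.

```shell
#!/bin/bash
# Sync an S3 prefix into a local directory.
# Credentials come from the instance's IAM role, so none are stored here.
# fetch_from_s3 is a hypothetical helper name, not part of the AWS CLI.
fetch_from_s3() {
    local source_uri="$1" dest_dir="$2"
    mkdir -p "$dest_dir" || return 1
    aws s3 sync "$source_uri" "$dest_dir"
}
```

In userdata this might be called as `fetch_from_s3 s3://your-bootstrap-bucket/media /var/www/media`, where both the bucket and the destination path are placeholders.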
Here's an example of a userdata script that you could set on your launch configuration so that new instances perform the bootstrapping tasks dynamically. Note that userdata scripts are executed as the root user.
#!/bin/bash
# Update your packages
yum update -y
# Get the bootstrapping repository (cloned into an explicit directory)
cd /tmp
git clone ssh://your-git-server/bootstrapping-repo.git /tmp/bootstrapping-repo
chmod +x /tmp/bootstrapping-repo/bootstrapping.sh
# Execute it
/tmp/bootstrapping-repo/bootstrapping.sh
# Remove the bootstrapping script remnants
rm -rf /tmp/bootstrapping-repo
This method allows you the flexibility to update your "bootstrapping-repo" as often as you need without having to create new AMIs regularly.
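If you want the clone-run-cleanup steps to be a bit more defensive, something along these lines could work. It is only a sketch: `run_bootstrap` is a hypothetical helper, the script name defaults to `bootstrapping.sh`, and it assumes `git` is installed and the root user can authenticate to your Git host.

```shell
#!/bin/bash
# Clone a bootstrapping repository into a temporary directory, run one
# script from it, and always clean up afterwards.
# run_bootstrap is a hypothetical helper name used for illustration.
run_bootstrap() {
    local repo_url="$1" script_name="${2:-bootstrapping.sh}"
    local workdir status=0
    workdir="$(mktemp -d)" || return 1
    # A shallow clone is enough: only the latest revision gets executed.
    git clone --depth 1 "$repo_url" "$workdir/repo" \
        && bash "$workdir/repo/$script_name" \
        || status=$?
    rm -rf "$workdir"
    # Non-zero if either the clone or the bootstrapping script failed.
    return "$status"
}
```

Called as `run_bootstrap ssh://your-git-server/bootstrapping-repo.git`, this returns the failing step's exit code, which then shows up in the userdata output.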
For background, I use S3 in my userdata to grab relevant SSH keys, host files etc, and Git for the bootstrapping repository.
It's also a good idea to refresh your AMIs regularly and point your launch configuration at the latest one, so that newly launched instances don't have to spend long updating themselves. Whether you do this manually every so often or write a script to do it via the API or CLI is up to you.
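For the CLI route, a minimal sketch might look like the following. It uses `aws ec2 create-image` and assumes the caller has permission to create images; the helper name `create_dated_ami`, the instance ID, and the name prefix are all placeholders.

```shell
#!/bin/bash
# Create a date-stamped AMI from an instance and print the new image ID.
# create_dated_ami is a hypothetical helper; the instance ID and name
# prefix are supplied by the caller.
create_dated_ami() {
    local instance_id="$1" prefix="$2"
    local name
    name="${prefix}-$(date +%Y%m%d-%H%M%S)"
    # --no-reboot avoids downtime, but the image's filesystem consistency
    # is then not guaranteed.
    aws ec2 create-image \
        --instance-id "$instance_id" \
        --name "$name" \
        --no-reboot \
        --query 'ImageId' \
        --output text
}
```

You would still need to reference the returned image ID from a new launch configuration, since an existing launch configuration cannot be edited in place.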
FYI: the output of the userdata script, and of any scripts it calls on instance launch, is logged to the file /var/log/cloud-init-output.log
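When debugging a fresh instance, a small helper like this can show the tail of that log. It is just a convenience sketch; `show_bootstrap_log` is a made-up name, and the optional arguments exist only so the path and line count can be overridden.

```shell
#!/bin/bash
# Print the last lines of the cloud-init output log.
# show_bootstrap_log is a hypothetical helper: the first argument
# overrides the log path, the second the number of lines (default 50).
show_bootstrap_log() {
    local log_file="${1:-/var/log/cloud-init-output.log}"
    local lines="${2:-50}"
    if [ ! -r "$log_file" ]; then
        echo "cannot read $log_file" >&2
        return 1
    fi
    tail -n "$lines" "$log_file"
}
```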