I am about to deploy a fairly complicated infrastructure as a hosting environment for several high-traffic sites. I will be using EC2 for the servers, along with several other AWS services. Please take a look at my diagram and give me some advice. Diagram of AWS infrastructure

Facts about this deployment:

  1. These machines will mainly host LAMP or LNMP stacks (the M lives on the DB server).
  2. I plan to use GlusterFS so that all the nodes behind the load balancer have the same files.
  3. I was planning to use Ubuntu on all the nodes, but I am comfortable with CentOS as well.
  4. I will use spot instances at various prices for the automatic scaling.
  5. Eventually I will move to Chef or Puppet to manage all of this, but I don't know how to use either yet.
  6. I will use nginx either as a proxy or as the only web server.
  7. I plan to use one small instance as the main node and a micro as the secondary node, with spot and on-demand instances for auto-scaling.
  8. It is not pictured, but I plan to use replicated EBS volumes on each node.

Some questions I have:

  1. Any problems you see in my setup?
  2. In what order should I set up and run the different components and software?
  3. Is the small instance too small to start with if I'm even considering this deployment (that is, is this just too complicated, and should I simply bump up the server power)? Basically, I want to always have backup servers available to handle the load, and I figured I could save some money with clustering.
  4. Any other advice?

I really appreciate all the feedback in advance.

EDIT: I just found this diagram from Amazon itself that is similar to mine (I think I modeled mine on something similar). AWS Diagram

ClakeB

2 Answers

The only thing that stands out here is the use of GlusterFS. Do you really intend to store application data (rather than just static content and code) as files? If not, simply replicating the files at deployment time (rsync, unison, or a direct version-control checkout) solves the problem without the drawbacks of a cluster filesystem. While EBS does simplify this, it still becomes a bottleneck on the system.
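As a rough illustration of the "replicate on deployment" idea, here is a minimal Python sketch that pushes a release directory to each web node with rsync. The function names, the `run` parameter, and the host names and paths in the usage note are hypothetical, and it assumes passwordless SSH from the deploy machine to every node. Treat it as a sketch rather than a finished deployment tool.

```python
import subprocess
from typing import Callable, Iterable, List


def build_rsync_command(src: str, host: str, dest: str) -> List[str]:
    """Build the rsync argument list for one target node (hypothetical helper)."""
    # -a preserves permissions and timestamps, -z compresses in transit,
    # --delete removes files on the node that are no longer in the release.
    return ["rsync", "-az", "--delete", src, f"{host}:{dest}"]


def push_release(
    src: str,
    hosts: Iterable[str],
    dest: str,
    run: Callable = subprocess.run,
) -> List[str]:
    """Copy the release to every host and return the hosts that failed."""
    failed = []
    for host in hosts:
        result = run(build_rsync_command(src, host, dest))
        if result.returncode != 0:
            failed.append(host)
    return failed
```

For example, `push_release("/srv/app/", ["web1", "web2"], "/var/www/app/")` would sync both nodes and return an empty list if every transfer succeeded. Because rsync only transfers changed files, a node that already holds a recent copy catches up quickly.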

OTOH, if you really are storing application data in files... then how? PHP does not have the sophisticated lock management facilities you need for concurrent access in a setup like this.

How much traffic are you proposing to handle?

symcbean
  • 1 million visitors a month on the main site (still growing), and one of the other sites has a little less. The others have significantly less. The problem I see is this: when I need the power of another server, how does the newest server get the files? I don't really want it to be waiting on rsync. Am I correct in this concern? The extra servers would come online automatically based on the load of the main server, and rsync would take time to get them in sync, correct? Thanks for your help. – ClakeB Dec 16 '11 at 16:33
  • "1 million visitors" is not a very meaningful statistic. What about the average number (and stddev) of resident requests? If you're not storing data in the files, then rsync allows you to timeshift the cost of replication (and take nodes offline during refresh). – symcbean Dec 19 '11 at 10:20
  • I just built a deployment like this. I'd be curious to hear how gluster ended up fitting in. If I could go back in time I'd probably look at OpsCenter to see if it would work as a replacement for Chef. – jorfus May 06 '15 at 21:02

The setup looks fine, except that there are several places where I think you are making assumptions.

Use configuration management tools such as Puppet or Chef from the start to build out your infrastructure.

As @symcbean suggested, you really won't need GlusterFS to sync the app files quickly. Configuration management tools such as Chef or Puppet can set up a complete copy of the app in seconds.

You really shouldn't decide on the instance type upfront. Deploy the application, run some performance tests, and see how it responds. Based on the results, you may need to move to a larger instance type.
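One way to turn "run some performance tests and see how it responds" into a decision is to compare a latency percentile from the test run against a target. The sketch below is only an illustration: the function names, the choice of the 95th percentile, and the 500 ms target in the usage note are assumptions, not something prescribed by AWS or by any particular load-testing tool.

```python
from typing import Sequence


def p95(samples: Sequence[float]) -> float:
    """Return the 95th percentile of the samples using the nearest-rank method."""
    if not samples:
        raise ValueError("need at least one sample")
    ordered = sorted(samples)
    # Nearest rank = ceil(0.95 * n), computed with integers to avoid float error.
    rank = (95 * len(ordered) + 99) // 100
    return ordered[rank - 1]


def needs_bigger_instance(latencies_ms: Sequence[float], target_p95_ms: float) -> bool:
    """True when the measured p95 latency exceeds the (assumed) target."""
    return p95(latencies_ms) > target_p95_ms
```

Feeding it the response times recorded during a load test, for instance `needs_bigger_instance(latencies, 500)`, gives a simple yes/no signal for whether the current instance type is keeping up.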

Remember to scale across multiple availability zones (Amazon's term for data centers) to build in redundancy. Use a load balancer such as ELB, which is resilient to zone failures.
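To make the redundancy argument concrete, here is a small Python sketch that spreads a number of web nodes evenly across zones and checks whether enough nodes remain if the most heavily loaded zone goes down. The function names and the zone names in the usage note are illustrative assumptions. It does not call any AWS API.

```python
from typing import Dict, Sequence


def spread_across_zones(count: int, zones: Sequence[str]) -> Dict[str, int]:
    """Distribute `count` instances across zones as evenly as possible."""
    if not zones:
        raise ValueError("need at least one zone")
    base, extra = divmod(count, len(zones))
    # The first `extra` zones each take one additional instance.
    return {zone: base + (1 if i < extra else 0) for i, zone in enumerate(zones)}


def survives_zone_loss(plan: Dict[str, int], needed: int) -> bool:
    """True if at least `needed` instances remain after losing the largest zone."""
    return sum(plan.values()) - max(plan.values()) >= needed
```

For example, `spread_across_zones(5, ["us-east-1a", "us-east-1b"])` places three nodes in the first zone and two in the second, so the plan survives a zone failure only if two nodes are enough to carry the load.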

Shyam Sundar C S