
I am trying to increase capacity for my website, which is growing beyond what my current web server can handle. The site runs on a dedicated web server (LiteSpeed) and a dedicated database server. It receives over 180,000 visitors per day, and 100,000 downloads are made from the site daily.

The site, which is PHP/MySQL based, hosts over 200GB of user-generated uploads/files shared publicly. For each upload we store the main/download file along with a preview: this can be a short MP3 file, a short MP4 video (converted to FLV for preview), or a JPG image, among a few other formats, which generally have a thumbnail and a larger image preview. We also have a forum with 20GB of attachments.

All dynamic pages, downloads and static content are served from the web server, and load averages sit around 20 throughout the day, the bottlenecks being disk wait and CPU (dual Xeon 5410).

My host has suggested mirroring the web server, with a hardware load balancer in front, which means keeping the bigger, slower disks. Alternatively, they suggest running one web server for dynamic pages with faster disks and moving all static content, thumbnails/previews and downloads to a dedicated static server running nginx. This would work fine for serving image previews; however, all downloads are served dynamically via a PHP script on the web server, as are the MP3 and FLV preview streams. I can't see how there would be any benefit in doing this for download/streaming content, as I assume the web server would still be under heavy load and only JS, CSS and preview images would be served directly from the static server. They also suggested setting up a private cloud, with a virtual web server and load balancer on each server.
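
For reference, the download script is essentially the usual PHP passthrough along these lines (a simplified sketch; the actual table, column and variable names differ):

```php
<?php
// download.php?id=123 -- simplified passthrough sketch; $db, the table
// and column names are placeholders. Every byte flows through PHP,
// which is why the web server stays loaded even if previews move off.
$id  = (int) $_GET['id'];
$row = $db->query("SELECT path, name FROM uploads WHERE id = $id")->fetch_assoc();
if (!$row) {
    header('HTTP/1.1 404 Not Found');
    exit;
}
header('Content-Type: application/octet-stream');
header('Content-Disposition: attachment; filename="' . $row['name'] . '"');
header('Content-Length: ' . filesize($row['path']));
readfile($row['path']); // PHP reads the file and writes it to the socket
```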

Could somebody explain how best to optimize in this scenario and make it flexible to scale up in future if need be?

Other info: our MP3 files are not large (350-400KB); FLV files are up to 10MB; but some of the other content, such as RAR/ZIP files, can go up to 30MB and averages about 10MB.

Thanks

markxi

3 Answers


Sorry to say, but I would grab a profiler, if one exists for PHP/MySQL, and optimize. Regardless of how I cut the numbers, this site is something that should happily run on an Atom processor with a few cores. 180,000 visitors per day is not THAT much for a well-programmed site. For the disc wait: get a proper RAID controller, or ZFS with 1-2 SSDs as cache, plus fast hard discs - and plenty of them. Databases are not something you put on a normal low-end server and expect performance from. Just to give you an idea: I have an 800GB database server and I am using 10 discs - 8x VelociRaptor in a RAID 10, plus 2 SSDs in a mirror for the logs. Disc waits will happen with a badly designed IO subsystem on any database.
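
If no full profiler (Xdebug, XHProf) is handy, even crude wall-clock timing narrows things down; on the MySQL side, the slow query log (long_query_time in my.cnf) shows which statements dominate. A minimal sketch - the phase names and log destination are arbitrary:

```php
<?php
// Crude profiling sketch: time each phase of a request and log the split.
// A real profiler gives call-level detail, but this needs no extension
// and quickly shows whether the DB or page rendering dominates.
$t0 = microtime(true);
// ... run your queries here ...
$dbTime = microtime(true) - $t0;

$t0 = microtime(true);
// ... build/render the page here ...
$renderTime = microtime(true) - $t0;

error_log(sprintf('profile: db=%.3fs render=%.3fs uri=%s',
    $dbTime, $renderTime, $_SERVER['REQUEST_URI']));
```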

So, again, if I were you I would:

  • Start optimizing my PHP code and put in some accelerators (see the caching sketch after this list). I remember handling 400,000 visitors in an hour on a dating site years ago, during a TV show, on a dual Pentium - with ASP, not compiled.

  • Start laying out a better IO subsystem.
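
On the accelerator point: besides an opcode cache (APC, XCache), APC's user cache can keep hot query results out of MySQL entirely. A minimal sketch, assuming the APC extension is loaded - the key name, TTL, table and $db handle are placeholders:

```php
<?php
// Serve a hot query result from APC shared memory instead of MySQL.
// Key name, 60s TTL and the uploads table are placeholder choices.
function get_top_downloads(mysqli $db) {
    $hits = apc_fetch('top_downloads', $ok);
    if ($ok) {
        return $hits; // shared-memory hit, no DB round-trip
    }
    $hits = array();
    $res = $db->query('SELECT id, name FROM uploads ORDER BY downloads DESC LIMIT 20');
    while ($row = $res->fetch_assoc()) {
        $hits[] = $row;
    }
    apc_store('top_downloads', $hits, 60); // cache for 60 seconds
    return $hits;
}
```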

Note: the latter may require new hardware. Anyhow, SuperMicro rules here - they have server cases with up to 72 drive bays in 4 rack units, or 24 discs in 2 rack units, all on a SAS backplane. I use one of those (20 discs in it now) and it really rocks.

TomTom

You can optimize serving static content through scripts by using the X-Sendfile header: PHP still handles the authentication and accounting, but the web server itself streams the file, so the PHP process is released immediately.
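
A minimal sketch of the idea - the header name varies by server (LiteSpeed honours X-LiteSpeed-Location, Apache's mod_xsendfile uses X-Sendfile, nginx uses X-Accel-Redirect), and $path/$name stand in for your own lookup:

```php
<?php
// Do the auth/logging/stats in PHP, then hand the transfer to the server.
// $path and $name come from your own database lookup (placeholders here).
header('Content-Type: application/octet-stream');
header('Content-Disposition: attachment; filename="' . $name . '"');
header('X-LiteSpeed-Location: ' . $path); // server streams the file itself
exit; // no readfile(): the PHP process is done in milliseconds
```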

You should probably split your static content and database onto different disks/arrays and experiment a bit with the static content array setup. In some cases RAID 1/RAID 10 might be better, in other cases RAID 5 might work better (especially if you're not writing much), and in some cases just having several individual drives (or RAID 1 pairs if you need redundancy) with files laid out evenly across all of them might do the trick - see the sketch below.
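
For the individual-drives option, a cheap way to lay files out evenly is to derive the mount point from a hash of the file ID, so reads always land on the right disk. A sketch - the mount paths and $fileId are placeholders:

```php
<?php
// Spread files across independent spindles by hashing the file ID.
// Deterministic mapping: the same ID always resolves to the same disk.
$disks = array('/mnt/disk0', '/mnt/disk1', '/mnt/disk2', '/mnt/disk3');
$disk  = $disks[abs(crc32($fileId)) % count($disks)]; // abs(): crc32 can be negative on 32-bit PHP
$path  = $disk . '/' . substr(md5($fileId), 0, 2) . '/' . $fileId; // 2-char fan-out keeps directories small
```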

Depending on how much memory you have at your disposal, consider keeping all the small files, or the most frequently requested ones (you can get stats from the web server logs), on a ramdisk, easing the load on the disks. (Although this really depends on the exact traffic you're seeing, since the OS already tries to do this for you with its page cache, which may or may not work well.)
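
The access log already holds those hit statistics; a quick tally of requests per URL points at the ramdisk candidates. A sketch - the log path is a placeholder and the regex assumes common/combined log format:

```php
<?php
// Tally requests per URL from the access log to find the hottest files.
// Log path is a placeholder; assumes common/combined log format.
$counts = array();
$fh = fopen('/var/log/lsws/access.log', 'r');
while (($line = fgets($fh)) !== false) {
    if (preg_match('/"(?:GET|HEAD) (\S+)/', $line, $m)) {
        $counts[$m[1]] = isset($counts[$m[1]]) ? $counts[$m[1]] + 1 : 1;
    }
}
fclose($fh);
arsort($counts);
print_r(array_slice($counts, 0, 20, true)); // top 20 most-requested URLs
```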

And of course, splitting the load across two servers, each serving half the files, could help you even without a load balancer (this, again, depends on the traffic).
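
That split needs no balancer at all if the download URLs themselves pick the host deterministically - the same hashing idea as the disk layout above. A sketch with placeholder hostnames:

```php
<?php
// Shard downloads across two hosts with no load balancer: the URL
// generator hashes the file ID, so each host serves a fixed half.
$hosts = array('dl1.example.com', 'dl2.example.com'); // placeholders
$host  = $hosts[abs(crc32($fileId)) % count($hosts)];
$url   = 'http://' . $host . '/download.php?id=' . urlencode($fileId);
```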

Fox

Before investing in hardware or changing your architecture, I suggest trying to find the underlying cause of the performance issues.

You have mentioned disk IO. What is causing this IO? Are you certain it is file downloads, or could it be logging or other activity?

I usually start by cataloguing what is reading from and writing to disk. Are there any specific programs/functions that tend to cause more problems than others? Try disabling certain tasks, e.g. send the Apache logs to /dev/null, and stop mail if it is also running on the server.

This is just an example of where I would start.

Many hosts are quick to push more hardware. There's certainly a business motive here, but typically it is also their only recourse for dealing with performance issues: they don't usually provide web performance optimization services, so the default answer becomes more hardware.

More hardware is expensive and has diminishing returns.

jeffatrackaid