We currently deliver large (1 GB+) files from a single Apache server, but that server is extremely disk-I/O-bound and we need to scale.
My first idea was to simply duplicate this Apache server, but our file library is too big to copy onto N servers for horizontal scaling.
So my next idea was to have two Apache servers (for high availability) in the backend, each with a separate copy of our entire library, and then N caching reverse proxies in front, where N grows as our delivery needs grow. Each reverse proxy would be very RAM-heavy and have as many spindles per GB as possible. The backend Apache servers would be more "archival", with a low spindle-to-GB ratio.
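To make the read path concrete, here is a rough Python sketch of what I have in mind for one proxy. It is only a toy model: the class names, the round-robin choice of backend, the LRU eviction policy, and a cache capacity counted in files rather than bytes are all my own assumptions for illustration, not the behavior of any particular proxy software. Failover between the two backends is not modeled.

```python
from collections import OrderedDict
from itertools import cycle


class Backend:
    """Stands in for one archival Apache server holding the full library."""

    def __init__(self, name, library):
        self.name = name
        self.library = library  # path -> file contents
        self.reads = 0          # how often this backend had to hit its disks

    def fetch(self, path):
        self.reads += 1
        return self.library[path]


class CachingProxy:
    """Stands in for one RAM-heavy reverse proxy with a small LRU cache."""

    def __init__(self, backends, capacity):
        self.backends = cycle(backends)  # assumption: plain round-robin
        self.capacity = capacity         # assumption: counted in files, not bytes
        self.cache = OrderedDict()
        self.hits = 0

    def get(self, path):
        if path in self.cache:
            # Cache hit: serve locally and mark the file as most recently used.
            self.hits += 1
            self.cache.move_to_end(path)
            return self.cache[path]
        # Cache miss: pull the file from the next backend and keep a copy.
        data = next(self.backends).fetch(path)
        self.cache[path] = data
        if len(self.cache) > self.capacity:
            self.cache.popitem(last=False)  # evict the least recently used file
        return data


# Hypothetical library and request stream, purely for illustration.
library = {f"/files/{i}.bin": f"payload-{i}".encode() for i in range(5)}
backends = [Backend("apache-a", library), Backend("apache-b", library)]
proxy = CachingProxy(backends, capacity=2)

requests = ["/files/0.bin", "/files/0.bin", "/files/1.bin",
            "/files/0.bin", "/files/2.bin", "/files/1.bin"]
for path in requests:
    proxy.get(path)

print("proxy hits:", proxy.hits)
print("backend reads:", {b.name: b.reads for b in backends})
```

The point of the sketch is that the backends only see cache misses, so their disk load depends on how well the hot set fits in the proxy tier rather than on total request volume.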
Is this a good architecture? Is there a better way to handle it?