6

I currently run a small website that hosts a large amount of generated static HTML. The problem is that disk space is limited and the HTML is growing by 1GB a week. (The files are grouped into directories of around 500 files each, totalling 10MB-100MB per directory. The files compress very well, to under 10% of their original size.)

Ideally I'm looking at a way to compress all the HTML files on the HDD while still being able to serve them easily.

Matthew
  • One possibility is to bzip the directories and use FUSE to mount each one as a directory, but I would prefer a less hackish solution – Matthew May 28 '09 at 03:20

7 Answers

6

The mod_gunzip mentioned by Matt Simmons doesn't appear to exist for Apache 2.x. The replacement the developer suggests is the Apache module mod_ext_filter. I haven't tested it, but it looks like it should be fairly easy to build a filter that decompresses files as they are served.
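
For example, a minimal, untested sketch of such a filter (the filter name, file layout, and paths are my assumptions, not tested configuration):

ExtFilterDefine gunzip mode=output cmd="/bin/gzip -dc"

<Directory "/var/www/html">
    # Apply the external filter only to the pre-compressed files, and
    # force the type so they are still served as HTML.
    <Files "*.html.gz">
        SetOutputFilter gunzip
        ForceType text/html
    </Files>
</Directory>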

The other thing you should consider is that most current browsers accept content that has been gzip-compressed. It may be possible for you to gzip your files and serve the gzipped files without doing anything special.

Zoredache
4

Compressed filesystem - another solution is to handle the compression a layer below the web server and its files, at the filesystem level.

I've not done this myself, but you could try out something like fusecompress: separate out your www directory if you haven't already, and make it a compressed filesystem of some sort.

Obviously this will cost you some performance, but if the processor is decent then it might be OK.
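
A rough sketch of the idea (the exact fusecompress invocation varies by version, so treat the argument order below as an assumption and check its documentation; paths are also assumptions):

# Keep the compressed backing store in one directory and point Apache's
# docroot at the transparent FUSE mount.
mkdir -p /var/www-store /var/www/html
fusecompress /var/www-store /var/www/html   # backing dir, then mount point (assumed order)
# Files written under /var/www/html are stored compressed in
# /var/www-store but read back as ordinary files by the web server.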

khosrow
  • I looked at FUSE solutions, but they all seemed very young and not really ready for general use – Matthew May 29 '09 at 02:19
1

Have you looked at mod_gunzip? I'm too new to link to it, but a Google search should point you in the right direction.

Matt Simmons
  • Is this the URL? (http://oldach.net/) It has both a mod_gunzip and a mod_bunzip2 package. These packages appear to be for Apache 1.x only. – Zoredache May 28 '09 at 05:29
  • You're right. I did find this while researching, though: http://log.samat.org/2005/10/06/trying_to_emulate_mod_gunzip_with_apache_2_filters – Matt Simmons May 28 '09 at 11:48
1

You could wrap all of your pages in a script that looks something like this:

bzcat $1.bz2

Where $1 is the requested file. A quick PHP/Perl/whatever script can pretty effectively pull the path out of the request variables, and there you go.
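
As one hedged example (the parameter name, docroot path, and URL scheme are all my assumptions, not anything from the answer), a PHP version might look like this:

<?php
// Map the requested page to its .bz2 file and stream it back,
// decompressing on the fly via PHP's bzip2 stream wrapper.
$docroot = '/var/www/static';              // assumed location of the .bz2 files
$page    = $_GET['page'] ?? '';            // e.g. index.php?page=dir/file.html

// Refuse anything that could escape the docroot.
$full = realpath($docroot . '/' . $page . '.bz2');
if ($full === false || strpos($full, $docroot . '/') !== 0) {
    http_response_code(404);
    exit;
}

header('Content-Type: text/html');
readfile('compress.bzip2://' . $full);     // requires the bz2 extension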

You do lose the speed of static files, but that might not matter for your use case.

Bill Weiss
0

One answer would be to run the website on a Windows host and simply compress the NTFS filesystem.

Another option would be an OpenSolaris system running ZFS with compression enabled.
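
Hedged sketches of both (the docroot path and ZFS dataset name are assumptions):

rem Windows (cmd.exe): compress the docroot in place with NTFS compression
compact /C /S:"C:\inetpub\wwwroot"

# OpenSolaris: turn on transparent compression for the dataset holding the docroot
zfs set compression=gzip tank/www
zfs get compressratio tank/www    # check how well it is compressing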

Kevin Kuphal
0

Most browsers understand gzipped HTML pages. One solution is to gzip each page and to have your web server add a "Content-Encoding: gzip" header to each response.
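
For Apache, one common way to do this is mod_rewrite plus AddEncoding. This is a sketch under the assumption that each page.html is stored pre-compressed as page.html.gz:

RewriteEngine On
# Serve page.html.gz in place of page.html when the client accepts gzip
RewriteCond %{HTTP:Accept-Encoding} gzip
RewriteCond %{REQUEST_FILENAME}.gz -f
RewriteRule ^(.+\.html)$ $1.gz [L]

# Tell Apache that .gz is an encoding, not a content type, so the
# response keeps Content-Type: text/html and gains Content-Encoding: gzip
RemoveType .gz
AddEncoding gzip .gz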

laurentb
0

Gzip all the files and use Options +MultiViews if you are using Apache.
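
A hedged sketch of that setup (the directory path is an assumption): with MultiViews, a request for /dir/page.html that finds no file of that exact name can be answered by /dir/page.html.gz when the client sends Accept-Encoding: gzip.

<Directory "/var/www/html">
    Options +MultiViews
    RemoveType .gz
    AddEncoding gzip .gz
    # Note: if only the .gz variant exists, clients that do not accept
    # gzip will get a 406 rather than an uncompressed page.
</Directory>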