My current web project has the following characteristics:
- A website which is basically a read-only archive of information. There are no interactive actions a visitor can perform.
- All pages of the website (currently around 15k) are pre-generated HTML files and graphics, created on another machine.
- The motivation behind this approach: since there is no dynamic processing and no database, the complexity of several web security aspects is much lower. Apart from that, the hope is to achieve good performance (in other words, to keep runtime costs low), since the whole website is essentially one big cache serving static files.
However, I underestimated the performance impact of keeping a large number of files in a small number of directories. Currently, the URLs of the website map directly to the pre-generated directory structure on the file system. E.g. the address domain.com/categoryA/...
maps to the directory webroot/pages/categoryA/...
which contains a very large number of HTML pages, and file lookups become slower with every additional file that is added to that directory.
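To make the mapping concrete, here is a rough Python sketch of what I mean (the file names, helper names and the hash-prefix idea are only my own illustration, not something I have in place): the generator could fan the files out into short hash-derived subdirectories on disk while the public URL stays flat, but the web server would then need a matching rewrite.

```python
import hashlib
from pathlib import Path

WEBROOT = Path("webroot/pages")

def flat_path(url_path: str) -> Path:
    """Current layout: the URL maps 1:1 onto the directory tree,
    so every page of a category ends up in the same directory."""
    return WEBROOT / url_path.lstrip("/")

def sharded_path(url_path: str, levels: int = 1, width: int = 2) -> Path:
    """Possible alternative: insert one or more short hash-derived
    subdirectories before the file name, so no single directory holds
    more than a few hundred files. The public URL stays unchanged; the
    web server would have to apply the same function when rewriting a
    request to a file path."""
    url_path = url_path.lstrip("/")
    parent, _, name = url_path.rpartition("/")
    digest = hashlib.md5(name.encode("utf-8")).hexdigest()
    shards = [digest[i * width:(i + 1) * width] for i in range(levels)]
    return WEBROOT / parent / Path(*shards) / name

# Example (file name made up):
#   flat_path("categoryA/article-0815.html")
#     -> webroot/pages/categoryA/article-0815.html
#   sharded_path("categoryA/article-0815.html")
#     -> webroot/pages/categoryA/<2-char hash prefix>/article-0815.html
```

Something like this would keep the directories small, but it ties the generator and the server rewrite rules together, and I'm not sure it is the right direction at all.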
How could I solve this problem? Are there any web servers or server-side technologies that specifically address the problem of serving large numbers of static pages? An SEO-friendly URL structure should be preserved. Apart from that, I'm open to any suggestions.