
I am currently caching dynamically generated PHP pages by saving them to a database with an expiry time field. If the page is requested again, the program checks for an unexpired cached version of the page to serve, and only regenerates the page if it can't find one.
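
Roughly what I'm doing now, sketched with PDO ($pdo, $pageKey and render_page() stand in for my real connection, cache key and page generator; the schema is illustrative):

    // Look for an unexpired cached copy of this page.
    $stmt = $pdo->prepare(
        'SELECT content FROM page_cache
         WHERE page_key = ? AND expires_at > NOW()'
    );
    $stmt->execute(array($pageKey));
    $html = $stmt->fetchColumn();

    if ($html === false) {
        $html = render_page();  // cache miss: regenerate
        $ins = $pdo->prepare(
            'REPLACE INTO page_cache (page_key, content, expires_at)
             VALUES (?, ?, NOW() + INTERVAL 10 MINUTE)'
        );
        $ins->execute(array($pageKey, $html));
    }
    echo $html;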

This works well - but would it put less load on the server to save the cached pages as files instead of saving them to the database? I could use a naming convention in the file names to handle the expiry time.
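
Something like this is what I have in mind for the file version (the expiry timestamp goes in the file name; paths, $pageKey and render_page() are just for illustration):

    // Cache file names carry the expiry time, e.g. cache/home_1305780000.html
    foreach (glob("cache/{$pageKey}_*.html") as $file) {
        if (preg_match('/_(\d+)\.html$/', $file, $m) && (int)$m[1] > time()) {
            readfile($file);  // still fresh: serve it and stop
            exit;
        }
        unlink($file);        // stale: clean it up
    }
    $html = render_page();    // regenerate and cache for 10 minutes
    file_put_contents('cache/' . $pageKey . '_' . (time() + 600) . '.html', $html);
    echo $html;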

If it is faster and less taxing on the server to read/write from a file instead of a database, I'll switch to that. Does anyone know which is faster, or what the best practice is?

Dan

4 Answers


If it is faster and less taxing on the server to read/write from a file instead of a database, I'll switch to that. Does anyone know which is faster, or what the best practice is?

The fastest is to use static files, because you can then serve the cache without even starting PHP (using RewriteRules). It won't scale properly if you have multiple front ends, however.
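
A rough illustration of the rewrite approach in .htaccess terms (the /cache/ layout is hypothetical; pruning stale files would be handled by whatever regenerates the cache):

    RewriteEngine On
    # If a pre-rendered copy of the requested page exists on disk,
    # serve it directly and never start PHP.
    RewriteCond %{DOCUMENT_ROOT}/cache/$1.html -f
    RewriteRule ^(.*)$ /cache/$1.html [L]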

The next best thing is to store it in memory, using Memcache for instance.
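
A minimal sketch with PHP's Memcached extension (the server address, $pageKey and render_page() are illustrative):

    $mc = new Memcached();
    $mc->addServer('127.0.0.1', 11211);

    $html = $mc->get($pageKey);
    if ($html === false) {               // cache miss
        $html = render_page();
        $mc->set($pageKey, $html, 600);  // expire after 10 minutes
    }
    echo $html;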

The least preferable is to use SQL. If you stick with it, at least do your hard drive a favor by using the MEMORY storage engine or equivalent (e.g. an unlogged table that lives in a tablespace stored on a RAM disk, if you're using PostgreSQL).
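
In MySQL terms, that looks something like the sketch below; note that MEMORY tables don't support TEXT/BLOB columns and cap rows at roughly 64 KB, so the page body has to fit in a VARCHAR:

    -- Illustrative schema; MEMORY tables live in RAM and are emptied on restart.
    CREATE TABLE page_cache (
        page_key   VARCHAR(64)    NOT NULL PRIMARY KEY,
        content    VARCHAR(16000) NOT NULL,  -- no TEXT/BLOB in MEMORY tables
        expires_at DATETIME       NOT NULL
    ) ENGINE=MEMORY;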

Denis de Bernardy
  • Thanks Denis, there are some great ideas here - I'll research them. – Dan May 19 '11 at 04:05
  • As a follow-up, I used microtime() before and after both alternatives, and the script could create a file, write to it and close it 75% faster than it took to execute the insert statement into MySQL. It's in the milliseconds, but at peak traffic this will really help... especially as I am experiencing failed MySQL connections when too many users get on at once. – Dan May 19 '11 at 05:02
  • Don't forget to count file locking when measuring the file write performance. The best way to avoid locking problems, in my experience, is to create a temporary file and then rename it to its final location (sketched after these comments). This makes writes a bit slower, but it avoids concurrent write problems. – Denis de Bernardy May 19 '11 at 05:06
  • File caching on /dev/shm or another RAM-drive mountpoint is also very fast. A simple implementation can be done in 100 lines. – twicejr Nov 18 '13 at 21:23
  • Would be great to see some stats or references to back up these points, especially if someone needs to convince a team to switch. – Simon East Sep 16 '14 at 12:08
  • @Simon: I'm sure you'll find plenty of stats comparing memory reads vs FS reads vs DB reads and how slow PHP is compared to serving static HTML files. It's mostly common sense, though. In much the same way that 1 < 1 + 1, Apache alone is faster than Apache + PHP. Reading data from memory alone is faster than putting HD data in memory + reading this data from memory. And reading from the same server is also faster than identifying the server to read from + reading off of it remotely. – Denis de Bernardy Sep 16 '14 at 16:31
  • @Simon: A potential nitpick one might make to the answer as written is whether serving a Memcached-based cache from PHP is materially slower than skipping PHP to serve a purely FS-based cache. Imho, the question is moot in practice. The moment you need to scale to more than one server you'll want something that you can shard without needing to reinvent the database (when worrying about file locks and such), so you might as well use something like Memcached (or SQL, or NoSQL) to begin with. As a bonus, it allows you to cache partials transparently. Put another way, you want Memcached in practice. – Denis de Bernardy Sep 16 '14 at 16:42
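
A minimal sketch of the write-then-rename technique Denis describes in the comments above (paths and $pageKey are illustrative; rename() is only atomic when both paths are on the same filesystem):

    // Write the page to a temp file in the target directory, then swap it
    // into place atomically so readers never see a half-written cache file.
    $final = "cache/{$pageKey}.html";
    $tmp   = tempnam(dirname($final), 'cache_');  // same filesystem as $final
    file_put_contents($tmp, $html);
    chmod($tmp, 0644);     // tempnam() creates the file as 0600
    rename($tmp, $final);  // atomic replace on POSIX filesystems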

Both options utilise the file system, as (assuming you're using MySQL without MEMORY/HEAP tables) database records are still stored in files.

If you have an active database connection at the time of requesting the cached data, I'd stick with the database.

pkavanagh
  • Thanks! Yes, a connection would already be open, as session management is via the database too, and I would have just authenticated the user before checking whether a cached copy exists. Sounds like I should just leave it. – Dan May 19 '11 at 03:46
  • Won't Memory/Heap still write to the WAL? – Denis de Bernardy May 19 '11 at 03:47

There is no single answer to your question: it really depends on the number of queries, and on how the database's own caching compares to the time it takes to read and parse a file. A lot of other factors may come into play as well.

However, you can use PHP extensions such as Memcached, as Denis suggested. This works even better in combination with a database when you use a framework like Doctrine: you still manage your data through the database, but serve the actual data in production from cached query results.
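
For instance, a Doctrine 2-era result cache backed by Memcache might be wired up roughly like this (the entity, query and cache-key names are made up):

    // Point Doctrine's result cache at Memcache ($config is the ORM configuration).
    $memcache = new Memcache();
    $memcache->connect('127.0.0.1', 11211);

    $cacheDriver = new \Doctrine\Common\Cache\MemcacheCache();
    $cacheDriver->setMemcache($memcache);
    $config->setResultCacheImpl($cacheDriver);

    // Later: cache this query's result for 10 minutes under a stable key.
    $page = $em->createQuery('SELECT p FROM Page p WHERE p.slug = :slug')
               ->setParameter('slug', $slug)
               ->useResultCache(true, 600, 'page_' . $slug)
               ->getOneOrNullResult();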

Tim

Here's something interesting that I have never tried but have heard good things about. You can use the Google Spreadsheet API to act as a database through HTTP requests. It sounds (in theory at least) like an ideal fit for a problem like this. Not an answer to your question, just another option.

locrizak