I'm trying to force caching of a very obnoxious PHP script which resists caching for no good reason by actively setting all the anti-cache headers:

Cache-Control: no-store, no-cache, must-revalidate, post-check=0, pre-check=0
Content-Type:  text/html; charset=UTF-8
Date:          Thu, 22 May 2014 08:43:53 GMT
Expires:       Thu, 19 Nov 1981 08:52:00 GMT
Last-Modified: 
Pragma:        no-cache
Set-Cookie:    ECSESSID=...; path=/
Vary:          User-Agent,Accept-Encoding
Server:        Apache/2.4.6 (Ubuntu)
X-Powered-By:  PHP/5.5.3-1ubuntu2.3

If at all avoidable I do not want to have to modify this 3rd party piece of code at all and instead just get Apache to cache the page for a while. I'm doing this very selectively to only very specific pages which have no real impact on session cookies or the like, i.e. which do not contain any personalised information.

CacheDefaultExpire      600
CacheMinExpire          600
CacheMaxExpire          1800
CacheHeader             On
CacheDetailHeader       On
CacheIgnoreHeaders      Set-Cookie
CacheIgnoreCacheControl On
CacheIgnoreNoLastMod    On
CacheStoreExpired       On
CacheStoreNoStore       On
CacheLock               On

CacheEnable disk /the/script.php

Apache is caching the page alright:

[cache:debug] AH00698: cache: Key for entity /the/script.php?(null) is http://example.com:80/the/script.php?
[cache_disk:debug] AH00709: Recalled cached URL info header http://example.com:80/the/script.php?
[cache_disk:debug] AH00720: Recalled headers for URL http://example.com:80/the/script.php?
[cache:debug] AH00695: Cached response for /the/script.php isn't fresh. Adding conditional request headers.
[cache:debug] AH00750: Adding CACHE_SAVE filter for /the/script.php
[cache:debug] AH00751: Adding CACHE_REMOVE_URL filter for /the/script.php
[cache:debug] AH00769: cache: Caching url: /the/script.php
[cache:debug] AH00770: cache: Removing CACHE_REMOVE_URL filter.
[cache_disk:debug] AH00737: commit_entity: Headers and body for URL http://example.com:80/the/script.php? cached.

However, it always insists that the "cached response isn't fresh" and never serves the cached version. I guess this has to do with the Expires header, which marks the document as expired (but I don't know whether that's the correct assumption). I've tried to overwrite and unset headers using mod_headers, but this doesn't help; whatever combination I try, the cache is not impressed at all. I'm guessing that the order of operations is wrong and the headers are being rewritten after the cache sees them. `early` header processing doesn't help either. I've experimented with CacheQuickHandler Off and with setting explicit filter chains, but nothing helps. I'm really mostly poking around in the dark, as I do not have a lot of experience with configuring Apache filter chains.

Is there a straightforward way to cache this obnoxious piece of code?

deceze
  • Do you need to use Apache's mod_cache or would you switch to another piece of software? *EDIT*: And could you include the full request and response headers? – Izzy May 28 '14 at 09:33
  • I don't have a problem replacing mod_cache with something else, however I would very much prefer to leave Apache as the front-end server if at all possible (i.e. not put another proxy in front of it). – deceze May 28 '14 at 09:35
  • I suspect some mismatch with the headers. Can you please post the full request *and* response headers? Also, is there a reason why *Last-Modified:* is empty? – Izzy May 28 '14 at 09:46
  • Yes, it's empty because the PHP script sets it so. :) The requests are nothing special, I've mostly been testing with curl which just sets `Accept: */*` and `User-Agent: curl/7.30.0` and nothing more. – deceze May 28 '14 at 09:49
  • Did you try to rewrite the Expires header with mod_rewrite instead of mod_headers? See e.g. http://stackoverflow.com/questions/7947906/add-expiry-headers-using-apache-for-paths-which-dont-exist-in-the-filesystem – Dennis Nolte May 28 '14 at 10:16
  • @Dennis I tried `Header unset` on basically all headers, without it making a difference. As far as I understand, `Header` is the last thing that actually takes effect in the process chain, so I'm suspecting it's too late for the cache to see. I don't see a different suggestion using mod_rewrite in the question you link to...?! Also, mod_rewrite is processed before the PHP handler AFAIK, so this wouldn't really work either!? – deceze May 28 '14 at 10:24
  • @deceze As I understand the filter section of the manual, you can decide when the cache runs via e.g. "AddOutputFilterByType CACHE;INCLUDES;DEFLATE text/html"; see http://httpd.apache.org/docs/current/mod/mod_cache.html under "Fine Control with the CACHE Filter". Sadly I have never done this personally, so I cannot check whether it actually works the way I think it does. Maybe the following links help you; they cover a different use case, but I think it is adaptable: http://www.askapache.com/htaccess/apache-speed-cache-control.html http://mark.koli.ch/set-cache-control-and-expires-headers-on-a-redirect-with-mod-rewrite – Dennis Nolte May 28 '14 at 10:31
  • @Dennis As far as I can see, `Header` can either be the first thing with the `early` flag (before PHP) or the last thing (after cache), so that doesn't work either way. And also as far as I can see I can't put it anywhere else using `AddOutputFilterByType` because it's not a filter. – deceze May 28 '14 at 10:43
  • Apache's caching is pretty primitive, and not the highly configurable tool you expect from Apache's web serving capabilities. If it works for you, then fine, but when it doesn't (as in this case), don't waste time with it. Try varnish or squid. – mc0e May 28 '14 at 17:23

1 Answer

The empty Last-Modified: header makes the response uncacheable, so the first thing to do is to generate a valid one. One solution is to use PHP's auto_prepend_file:

Create a file prepend.php with content like this:

  <?php header("Last-Modified: " . gmdate("D, d M Y H:i:s") . " GMT"); ?>

Then add this directive to your vhost configuration: php_value auto_prepend_file path_to_prepend.php

At this point, verify that the server response carries a valid Last-Modified: header. If it doesn't, the response can't be cached. If it does, your existing mod_headers and mod_cache setup may now work.
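One way to run that check from PHP is sketched below. This is only an illustration: `last_modified_is_valid()` is a hypothetical helper name, and the URL in the usage comment is the placeholder from the question. `get_headers()` returns the raw response header lines for a URL, and the helper looks for a non-empty, parseable Last-Modified value among them.

```php
<?php
/**
 * Hypothetical helper: returns true if the given raw header lines contain
 * a Last-Modified header whose value is non-empty and parseable as a date.
 */
function last_modified_is_valid(array $headerLines)
{
    foreach ($headerLines as $line) {
        // Header names are case-insensitive, so match without regard to case.
        if (stripos($line, 'Last-Modified:') === 0) {
            $value = trim(substr($line, strlen('Last-Modified:')));
            return $value !== '' && strtotime($value) !== false;
        }
    }
    return false; // no Last-Modified header at all
}

// Usage (URL is a placeholder; get_headers() returns false on failure):
// $lines = get_headers('http://example.com/the/script.php');
// var_dump($lines !== false && last_modified_is_valid($lines));
```

An empty `Last-Modified:` line, like the one in the question, makes the helper return false.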

If that still doesn't work, you can combine Squid and Apache like this:

Apache Configuration

In your current vhost, enable mod_rewrite and use it to proxy the traffic you want to cache to the caching server:

<VirtualHost your_current_virtual_host:80>
 ServerName your.site.com
 ..
 RewriteEngine on

 # This enables the caching server to see the request as http://your.site.com/..
 ProxyPreserveHost on

 # This should be at VirtualHost level, not <Directory> or .htaccess


 # The DoSquid env variable decides if we send the request to the cache server
 # Adjust it for your needs
 RewriteRule /the/script.php - [E=DoSquid:Yes]

 # POSTs are not cacheable
 RewriteCond %{REQUEST_METHOD} ^POST$
 RewriteRule .* - [E=DoSquid:No]

 # Feel free to add any rule which makes sense for your needs

 # Requests from localhost are calls from the "primary" vhost ( see below )
 RewriteCond %{REMOTE_ADDR} ^127\.0\.0\.1$
 RewriteRule .* - [E=DoSquid:No]

 RewriteCond %{ENV:DoSquid} ^Yes$
 RewriteRule /the/script.php http://ip_of_caching_server/the/script.php [P,L,QSA]

 ..
 ..
</VirtualHost>

# This VirtualHost will be accessed by your caching server as the primary server for your site
# Port 8009 can be anything, it just must be a separate virtual host

<VirtualHost your_current_virtual_host:8009>
 ServerName your.site.com
 ..
 RewriteEngine on

 # Here I make heavy use of mod_headers in order to get a cacheable response.
 # Needless to say, this might completely break your application: the responses
 # are completely anonymized.

 Header unset Set-Cookie
 Header unset Etag
 Header unset Pragma
 RequestHeader unset Cookie

 # Now fix the Cache-Control header..
 Header merge Cache-Control public
 # The max-age is a pain. We have to set one if it's not set, and we have to change it if it's 0.
 # "bidon" is just a dummy placeholder value that gets replaced below.
 Header merge Cache-Control "max-age=bidon"
 # Case when we have: Cache-Control: max-age=.., ....
 Header edit  Cache-Control "^(.*)max-age=(.*)max-age=bidon, (.*)$" $1max-age=$2$3
 # Case when we have: Cache-Control: yyy=bidon, max-age=..
 Header edit  Cache-Control "^(.*)max-age=(.*), max-age=bidon$" $1max-age=$2
 # Now replace the placeholder if there was no max-age; set it to 10 minutes
 Header edit  Cache-Control "max-age=bidon" "max-age=600"
 # Now replace the value if there was a max-age=0; set it to 10 minutes
 Header edit  Cache-Control "max-age=0" "max-age=600"

 # Remove Cache-Control parameters which prevent caching
 Header edit Cache-Control "no-cache, " ""
 Header edit Cache-Control "no-store, " ""
 Header edit Cache-Control "post-check=0, " ""
 Header edit Cache-Control "pre-check=0, " ""
 Header edit Cache-Control "must-revalidate, " ""

 # The request is now forwarded to the first vhost. It will not loop because we do not cache requests from 127.0.0.1
 ProxyPreserveHost on
 # In vhost context $1 already starts with a slash, so don't add another one
 RewriteRule ^(.*)$ http://127.0.0.1$1 [P,L,QSA]

 ..
 ..
</VirtualHost>

Cache server Configuration

You can probably use anything here: Squid, Apache or Varnish. With Squid, you have to configure it as a reverse proxy and declare cache_peer your.site.com parent 8009 0 no-query originserver .. Alternatively, you may be able to just enable mod_cache in the second vhost to achieve what you want.

Olivier S
  • Thanks, one of these things ought to work. Haven't had the time yet to test it out, but I'll award you the answer already anyway. :) – deceze Jun 02 '14 at 09:43
  • I was able to gain control of the cache handling via `php_flag output_buffering on` and `php_value auto_append_file foo.php`, in which I'm unsetting all those silly headers conditionally based on an environment variable... Thanks again! – deceze Jun 03 '14 at 12:42
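The approach from the last comment might look roughly like the sketch below. The comment does not show the actual foo.php, so this is a guess under stated assumptions: `FORCE_CACHE` is a hypothetical environment variable (for example set with `SetEnv FORCE_CACHE 1` for the cacheable URLs only), the helper function names are made up, and the 600-second lifetime simply mirrors the CacheDefaultExpire value from the question. It relies on `output_buffering` being on, so that no headers have been sent by the time the appended file runs.

```php
<?php
// foo.php: hypothetical auto_append_file, executed after the third-party script.

/** True only when the (hypothetical) FORCE_CACHE variable is set to "1". */
function should_force_cache($value)
{
    return $value === '1';
}

/** Builds replacement headers that allow caching for $maxAge seconds from $now. */
function cacheable_headers($maxAge, $now)
{
    return array(
        'Cache-Control' => 'public, max-age=' . (int) $maxAge,
        'Expires'       => gmdate('D, d M Y H:i:s', $now + $maxAge) . ' GMT',
        'Last-Modified' => gmdate('D, d M Y H:i:s', $now) . ' GMT',
    );
}

// getenv() returns false when the variable is not set, so nothing happens then.
if (should_force_cache(getenv('FORCE_CACHE')) && !headers_sent()) {
    // Drop the anti-cache headers the script has queued so far.
    foreach (array('Cache-Control', 'Expires', 'Pragma', 'Last-Modified') as $name) {
        header_remove($name);
    }
    // Queue cache-friendly replacements instead.
    foreach (cacheable_headers(600, time()) as $name => $value) {
        header($name . ': ' . $value);
    }
}
```

Set-Cookie is deliberately left alone here, since the mod_cache configuration in the question already excludes it from the stored response with CacheIgnoreHeaders.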