7

My proxy cache path is set to a very high size

proxy_cache_path  /var/lib/nginx/cache  levels=1:2   keys_zone=staticfilecache:180m  max_size=700m;

and the size used is only

sudo du -sh *
14M cache
4.0K    proxy

Proxy cache valid is set to

proxy_cache_valid 200 120d;

I track HIT and MISS via

add_header X-Cache-Status $upstream_cache_status;

Despite these settings I am seeing a lot of MISSes, even for pages I intentionally ran a cache warmer on an hour ago.

How do I debug why these MISSes are happening? How do I find out if the miss was due to eviction, expiration, some rogue header etc? Does Nginx provide commands for this?

Edit: Full config

    # at http level
    proxy_cache_path  /var/lib/nginx/cache  levels=1:2 inactive=400d keys_zone=staticfilecache:180m  max_size=700m;
    proxy_temp_path /var/lib/nginx/proxy;
    proxy_connect_timeout 30;
    proxy_read_timeout 120;
    proxy_send_timeout 120;
    #prevent header too large errors
    proxy_buffers 256 16k;
    proxy_buffer_size 32k;
    #httpoxy exploit protection
    proxy_set_header Proxy "";

    # at server level 
    add_header Cache-BYPASS-Reason $skip_reason;

    # define nginx variables
    set $do_not_cache 0;
    set $skip_reason "";
    set $bypass 0;

    # security for bypass so localhost can empty cache
    if ($remote_addr ~ "^(127.0.0.1|Web.Server.IP)$") {
        set $bypass $http_8X0;
    }

    # skip caching WordPress cookies
    if ($http_cookie ~* "comment_author_|wordpress_(?!test_cookie)|wp-postpass_" ) {
        set $do_not_cache 1;
        set $skip_reason Cookie;
    }

    # Don't cache URIs containing the following segments
    if ($request_uri ~* "/wp-admin/|/xmlrpc.php|wp-.*.php") {
        set $skip_cache 1;
        set $skip_reason URI;
    }

    # https://guides.wp-bullet.com/how-to-configure-nginx-reverse-proxy-wordpress-cache-apache/
    location / {
            proxy_pass http://127.0.0.1:8000;

            proxy_set_header X-Real-IP  $remote_addr;
            proxy_set_header X-Forwarded-For $proxy_add_x_forwarded_for;
            proxy_set_header X-Forwarded-Proto https;
            proxy_set_header X-Forwarded-Port 443;
            proxy_set_header Host $host;
            proxy_set_header Accept-Encoding "";

            # may need to comment out proxy_redirect if get login redirect loop
            proxy_redirect off;

            proxy_cache_key "$scheme://$host$uri";
            add_header X-Nginx-Cache-Head "$scheme://$host$uri";
            proxy_cache staticfilecache;
            proxy_cache_valid       200 301 302 100d;
            proxy_cache_valid 404 1m;


            add_header Cache-Control public;

            proxy_ignore_headers Expires;
            proxy_ignore_headers  "Cache-Control";
            proxy_ignore_headers X-Accel-Expires;

            proxy_hide_header "Cache-Control";
            proxy_hide_header Pragma;
            proxy_hide_header Server;
            proxy_hide_header Request-Context;
            proxy_hide_header X-Powered-By;
            proxy_cache_revalidate on;

            proxy_hide_header X-AspNet-Version;
            proxy_hide_header X-AspNetMvc-Version;
            #proxy_pass_header X-Accel-Expires;


            add_header X-Nginx-Cache-Status $upstream_cache_status;

            proxy_cache_use_stale  error timeout invalid_header updating http_500 http_502 http_503 http_504;
            proxy_cache_bypass $arg_nocache $do_not_cache $http_8X0;
            proxy_no_cache $do_not_cache;

    }

    location ~* \.(jpg|png|gif|jpeg|css|js|mp3|wav|swf|mov|doc|pdf|xls|ppt|docx|pptx|xlsx)$ {
            proxy_cache_valid 200 120d;
            expires 364d;
            add_header Cache-Control public;
            proxy_pass http://127.0.0.1:8000;
            proxy_cache staticfilecache;
            add_header X-Nginx-Cache-Status $upstream_cache_status;
            proxy_cache_use_stale  error timeout invalid_header updating http_500 http_502 http_503 http_504;
    }
Quintin Par
  • you might want to create a new logging format, using which you should be able to study the behavior of your caching server and investigate it further based on the results yielded. – Corleone May 12 '18 at 16:35
  • @Corleone What should I add to the logs beside the $upstream_cache_status? – Quintin Par May 12 '18 at 17:35
  • Nginx offers powerful debugging methods... Ref: https://nginx.org/en/docs/debugging_log.html – Pothi Kalimuthu May 19 '18 at 06:27
  • @PothiKalimuthu, unfortunately, I can't compile Nginx. – Quintin Par May 19 '18 at 13:49
  • Some operating systems have this compiled and packaged. What OS do you use? – Pothi Kalimuthu May 20 '18 at 01:21
  • Centos. Mine was compiled with http_addition_module http_auth_request_module http_dav_module http_flv_module http_gunzip_module http_gzip_static_module http_mp4_module http_random_index_module http_realip_module http_secure_link_module http_slice_module http_ssl_module http_stub_status_module http_sub_module http_v2_module mail_ssl_module stream_realip_module stream_ssl_module stream_ssl_preread_module – Quintin Par May 20 '18 at 15:40

3 Answers

8

Caching:

Are you enabling the proxy_cache in your location or server block?

For example, a few settings in the location / block from the Nginx docs.

proxy_cache_path  /var/lib/nginx/cache  levels=1:2   keys_zone=staticfilecache:180m  max_size=700m;

server {
    # ...
    location / {
        # must match the keys_zone name defined in proxy_cache_path
        proxy_cache staticfilecache;
        proxy_cache_revalidate on;
        # a response is only cached after it has been requested 3 times
        proxy_cache_min_uses 3;
        proxy_cache_use_stale error timeout updating http_500 http_502
                              http_503 http_504;
        proxy_cache_background_update on;
        proxy_cache_lock on;
        # ...
    }
}

For the cache to work you need at least the two mandatory settings:

  • proxy_cache_path
  • proxy_cache

If you set it in some location block, are you sure that's the one you want to be caching?


Analyzing

If you wish to analyze the hits, you can create a specific log for that:

log_format cache_st '$remote_addr - $upstream_cache_status [$time_local]  '
                    '"$request" $status $body_bytes_sent '
                    '"$http_referer" "$http_user_agent"';

And in the same server or location block, you can add it as a secondary log, so you don't miss the other stuff:

access_log   /var/log/nginx/domain.com.access.log;
access_log   /var/log/nginx/domain.com.cache.log cache_st;

You can then check some stats:

HIT vs MISS vs BYPASS vs EXPIRED

awk '{print $3}' cache.log | sort | uniq -c | sort -rn
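To turn those raw counts into shares, a small helper along these lines may be handy. It is only a sketch: it assumes the cache_st format above (cache status in the third whitespace-separated field), and the function name cache_status_share is made up for this example.

```shell
# Print each cache status with its count and share of all requests.
# Assumes field 3 of every log line is $upstream_cache_status.
cache_status_share() {
    awk '{ count[$3]++; total++ }
         END {
             for (s in count)
                 printf "%s %d %.1f%%\n", s, count[s], 100 * count[s] / total
         }' "$1" | sort -k2,2nr
}

# Usage (path taken from the access_log line above):
#   cache_status_share /var/log/nginx/domain.com.cache.log
```

A low HIT share right after a warm-up run suggests the entries are either not being stored at all or are being removed shortly afterwards.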

MISS URLs:

awk '($3 ~ /MISS/)' cache.log | awk '{print $7}' | sort | uniq -c | sort -rn

BYPASS URLs:

awk '($3 ~ /BYPASS/)' cache.log | awk '{print $7}' | sort | uniq -c | sort -rn

MISS vs BYPASS

  • MISS occurs when a pattern is configured to be cached but the response was not in the cache at the time of the request. In a correct configuration, subsequent requests will be served from cache, depending on the caching duration and other parameters.
  • BYPASS occurs when a pattern was explicitly configured NOT to use cache. e.g. skipping cache for logged in user. Subsequent requests will also be bypassed.
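Going by that definition, a single MISS per URL is expected, while a URL that MISSes repeatedly is the one worth investigating. A rough sketch for listing those, again assuming the cache_st format (status in field 3, request URI in field 7); repeated_misses is a hypothetical helper name:

```shell
# List URLs that were answered with MISS more than once, most frequent first.
repeated_misses() {
    awk '$3 == "MISS" { n[$7]++ }
         END { for (u in n) if (n[u] > 1) print n[u], u }' "$1" | sort -rn
}

# Usage:
#   repeated_misses /var/log/nginx/domain.com.cache.log
```

Keep in mind that with proxy_cache_min_uses set above 1, the first few requests for a URL are expected to MISS.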

Source for the analysis section: https://easyengine.io/tutorials/nginx/upstream-cache-status-in-access-log/

Another option for analyzing on the fly via console is to use GoAccess, a really nice real time web log analyzer, which only needs ncurses to work: https://goaccess.io/

Leo Gallego
3

You may need to set the inactive parameter on proxy_cache_path to something greater than 120d (or whatever you want your maximum cache time to actually be). The default setting for inactive is 10 minutes. As long as a cached URL is accessed within the inactive window, the entry stays in the cache; if it is not accessed within that window, it is removed regardless of how fresh it is. See Understanding the nginx proxy_cache_path directive for more information.

I believe this falls outside the typical $upstream_cache_status style of debugging because cache cleanup doesn't happen within the request/response cycle. Cleanup is handled by Nginx's separate cache manager process, which periodically removes inactive entries and trims the cache to max_size, rather than by the worker serving the request. I'm not sure where this activity would show up in logs, but it's likely only going to show up with a debug-enabled build.
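Without a debug build, one thing you can still do is check whether the entry exists on disk. The Nginx docs describe cache file names as the MD5 of the cache key, with levels=1:2 putting the file under a directory named after the last hex character and then the two characters before it. The sketch below relies on that layout and on md5sum being available; the helper names cache_path_for_hash and cache_path_for_key, and the example.com URL, are placeholders of mine.

```shell
# Build the on-disk path for a given MD5 hash, assuming levels=1:2.
cache_path_for_hash() {
    root=$1
    hash=$2
    l1=$(printf '%s' "$hash" | cut -c32)      # last hex character
    l2=$(printf '%s' "$hash" | cut -c30-31)   # the two characters before it
    printf '%s/%s/%s/%s\n' "$root" "$l1" "$l2" "$hash"
}

# Hash a cache key (the expanded proxy_cache_key string) and build its path.
cache_path_for_key() {
    hash=$(printf '%s' "$2" | md5sum | cut -d' ' -f1)
    cache_path_for_hash "$1" "$hash"
}

# Usage, with the key built the way the question does ("$scheme://$host$uri"):
#   f=$(cache_path_for_key /var/lib/nginx/cache 'https://example.com/some-page/')
#   sudo ls -l "$f"                  # no such file: never stored, or already removed
#   sudo grep -a -m1 '^KEY: ' "$f"   # cache files normally carry a "KEY: ..." line
```

If the file is there right after warming and gone an hour later, eviction or inactive cleanup is the likely cause; if it never appears, the response is not being stored in the first place.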

Mike Howsden
  • 1
    This might just be the reason. I’ve set inactive to 400 days. Thank you for this gem. I’ll report back in a day. – Quintin Par May 22 '18 at 04:51
  • 1
    It looks like the MISSes have come down: https://d.pr/i/3KtaRF but not completely. I also ran a cache warmer. I wonder if there’s some more that I am missing. I’ve also posted the complete config above. – Quintin Par May 22 '18 at 20:54
  • The section about when Nginx decides to cache something here https://www.nginx.com/blog/nginx-caching-guide/ is good. – Mike Howsden May 23 '18 at 02:18
2

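What are you trying to cache? A CMS? A static page? Usually, if the backend sends Cache-Control: no-cache, Expires: -1, or Cache-Control: private, you will get MISSes. Cookies will also cause MISSes: by default Nginx does not cache a response that carries a Set-Cookie header.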
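To see what the backend actually sends, you could query it directly and keep only the headers that influence Nginx's caching decision. This is a sketch, not a complete diagnostic: cache_relevant_headers is a name invented here, example.com stands in for the real host, and the backend address is the one from the question's proxy_pass.

```shell
# Keep only response headers that can stop or shorten caching in Nginx.
# Reads raw response headers on stdin.
cache_relevant_headers() {
    grep -i -E '^(set-cookie|cache-control|expires|x-accel-expires|vary):'
}

# Usage: dump the backend's headers for one page and filter them.
#   curl -s -o /dev/null -D - -H 'Host: example.com' \
#        http://127.0.0.1:8000/some-page/ | cache_relevant_headers
```

Note that Set-Cookie and "Vary: *" prevent caching unless they are listed in proxy_ignore_headers too, so ignoring only Expires, Cache-Control and X-Accel-Expires does not cover them.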

x86fantini
  • 1
    I had set to proxy_ignore_headers Expires; proxy_ignore_headers "Cache-Control"; proxy_ignore_headers X-Accel-Expires; Should I be doing more? – Quintin Par May 22 '18 at 18:15
  • Please post here complete configuration. Also tell us if it's a CMS or custom php. Thx – x86fantini May 22 '18 at 18:39
  • Thanks for responding. I’ve updated the config above. Mine is a WordPress website that’s largely static. It does not have commenting. So once a post is published the page is static, and to update the post I invalidate the cache with a secret header. – Quintin Par May 22 '18 at 20:42