
Is there a way to rewrite a URL when the response is a 404 in Varnish 2.1.5?

For example, I'd like to request a URL that may or may not exist. If it doesn't exist, I'd like to rewrite the URL and try the new one instead.

I'm new to Varnish and don't completely understand the lifecycle of a request (if anyone knows of a guide or article explaining this, please share).

I've tried setting some variables and request headers and checking resp.status, but they seem to get lost somewhere in the lifecycle and the page 404s anyway:

if (req.http.cookie ~ "lang_pref"
      && resp.status == 404
      && req.url ~ "^\/(en|fr|de)(\/.*)?$"
      ) {
    set resp.http.Location = "https://" req.http.host regsub(req.url, "^\/(en|fr|de)\/","/");
  }

The use case is a translated site. Example:

Website.com/french/page may or may not exist.
If /french/page responds with a 404, then try /page instead.
If /page doesn't exist, then 404.

Remy Lebeau
Vinnie James
  • You may want to reconsider redirecting on 404s like this, because search engines may view it as deceptive and penalize the site. A good practice is to give the user options to click as they choose. A 301 redirect, on the other hand, would be fine, since that is the purpose of a 301. – norcal johnny Jun 20 '17 at 02:12
  • That's a good point. How would I tell the 404 that it should be a 301 instead? – Vinnie James Jun 20 '17 at 02:30
  • If you keep in mind the purpose of a 301 or a 404 and when to use each for the benefit of the visitor, then you will make the best decision, as search engines look for the best result for a visitor. A 301 says a page has moved and here is the new page it permanently moved to, whereas a 404 says the page is non-existent or temporarily not available. In your case a 404 is perfect for describing the resulting page content. At that point, give visitors a few links to closely related content; that will please both the visitor and search engines. 404 pages are not punished, but forced redirects are. – norcal johnny Jun 20 '17 at 02:45
  • Yes, I understand that. What I'm asking is: how do I tell Varnish to respond with a 301 instead of a 404 when the page doesn't exist? – Vinnie James Jun 20 '17 at 02:47

2 Answers


I am writing this as an answer or it will look goofy as a comment. Keep in mind that 301 is permanently moved and 302 is temporarily moved.

You can also adapt this to use a regex, as you did in your post.

sub vcl_recv {
    if (req.url ~ "^/old/page") {
        return (synth(301, "/new/page"));
    }
    if (req.url ~ "^/this/oldPage") {
        return (synth(302, "/this/newPage"));
    }
}

sub vcl_synth {
    if (resp.status == 301 || resp.status == 302) {
        set resp.http.location = resp.reason;
        set resp.reason = "Moved";
        return (deliver);
    }
}
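
Note that `synth()` and `vcl_synth` only exist in Varnish 4 and later. On 2.x, a rough equivalent would use the `error` keyword and `vcl_error`. The following is a minimal sketch, not tested on 2.1.5. The status code 750 is an arbitrary placeholder (an assumption, chosen so it cannot clash with a real backend status), and the paths are the same illustrative ones as in the Varnish 4 version:

```vcl
sub vcl_recv {
    # Hypothetical internal status 750 means "redirect wanted";
    # the reason string carries the target path.
    if (req.url ~ "^/old/page") {
        error 750 "/new/page";
    }
}

sub vcl_error {
    if (obj.status == 750) {
        # Turn the internal marker into a real redirect.
        set obj.http.Location = obj.response;
        set obj.status = 301;
        set obj.response = "Moved Permanently";
        return (deliver);
    }
}
```

Keep in mind that `vcl_error` only runs for errors raised inside Varnish (for example by `error`), not for a 404 that the backend returns on its own.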

UPDATE: To address comments.

sub vcl_error {
  if (obj.status == 404) {
    set obj.status = 301;
    set obj.http.Location = obj.response;
    return (deliver);
  }
}
norcal johnny
  • That helps a bit, but I would prefer to not keep a list of which files do and don't exist. Any way for the server backend to see the 404, check the URL and reply back to the client with 301 and the rewritten URL? – Vinnie James Jun 20 '17 at 03:13
  • Maybe something like `if (beresp.status == 404) {maybe give client a 302 instead}` – Vinnie James Jun 20 '17 at 03:18
  • You can use: `sub vcl_error { if (obj.status == 404) { set obj.status = 301; set obj.http.Location = obj.response; return (deliver); } }` – norcal johnny Jun 20 '17 at 03:24
  • Glad it worked out :) If you do not mind selecting this as the correct answer that would be great or please advise if there is more to call this one complete. Cheers – norcal johnny Jun 20 '17 at 03:37
  • It looks like `synth()` is not valid syntax in 2.1.5; is there an alternative? Also, the `sub vcl_error { if (obj.status == 404)` block seems to be ignored, and Apache still serves the default 404 page. – Vinnie James Jun 20 '17 at 20:27

Here is what ended up working for my use case:

sub vcl_fetch {
...
  if (beresp.status == 404 && req.url ~ "^\/(en|fr|de)(\/.*)?$") {
    error 494;
  }

  return(deliver);
}

sub vcl_error {
  # Double check i18n pages for English before 404
  if (obj.status == 494) {
      set obj.http.Location = "https://" req.http.host regsub(req.url, "^\/(en|fr|de)\/","/") "?notranslation";
      set obj.status = 302;
      return(restart);
  }
...
}
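
If the goal is to have Varnish itself retry the untranslated URL rather than send the client a redirect, something along these lines might also work. This is an untested sketch: it assumes Varnish 2.1 lets `vcl_fetch` modify `req.url` and return `restart`, and it uses `req.restarts` as a guard so that a 404 on the fallback URL is delivered rather than retried again.

```vcl
sub vcl_fetch {
    # Only retry once: req.restarts is 0 on the first pass.
    if (beresp.status == 404
        && req.restarts == 0
        && req.url ~ "^/(en|fr|de)/") {
        # Strip the language prefix, e.g. /fr/page becomes /page.
        set req.url = regsub(req.url, "^/(en|fr|de)/", "/");
        return (restart);
    }
}
```

With this, a 404 for /fr/page would trigger one more backend fetch for /page; if that also returns a 404, the 404 is delivered as usual.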
Vinnie James