I'm developing a web application that manages a large number of images, storing and resizing them.

An image request looks like this: domain:port/image_id/size

The server takes the image_id and, if an image of that size doesn't exist yet, creates it and stores it on the filesystem.

Everything works and the server is running, but I need the browser to cache those images for at least one day to reduce the server's bandwidth consumption.

I ran several tests, but nothing seems to work.

Here is the code I use to build the response headers:

response.writeHead(304, {
    "Pragma": "public",
    "Cache-Control": "max-age=86400",
    "Expires": new Date(Date.now() + 86400000).toUTCString(),
    "Content-Type": contentType
});
response.write(data);
response.end();

I also tried with response status 200. contentType is always a MIME type like "image/jpg" or "image/png", and data is the byte buffer of the image.

Any advice? Thanks a lot.

live long and prosper,

d.

Daniele P.
  • RFC2616 10.3.5: If the client has performed a conditional GET request and access is allowed, but the document has not been modified, the server SHOULD respond with this status code. The 304 response MUST NOT contain a message-body [...]. are you sending the 304 in response to GET with If-Modified-Since/If-None-Match? why do you `response.write(data)` when the 304 says "you don't need to retrieve again, use previously fetched data"? – just somebody Mar 05 '14 at 15:17
  • You're right. With 304 I get the first request working while the second, as you pointed, returns an empty body. That's just a test, I used also the code 200. I cannot store the image in the browser cache in either ways – Daniele P. Mar 05 '14 at 15:29
  • i don't know what "working" means, or what "first" and "second" requests you're talking about. but it never hurts to actually read the manual, and be conservative. i'll ignore the broken 304 in the question and assume 200, with the rest of the code intact. http://tools.ietf.org/html/rfc2616#section-14.32 defines only "no-cache" for Pragma. it may be that your browser sees the Pragma header and decides the response is not cacheable. – just somebody Mar 05 '14 at 15:53
  • anyway, can you provide a capture of the communication between the browser and the server? are you sure you don't have "ignore caches" checked in firebug or similar? are you sure the subsequent requests come *without* the If-Modified-Since / If-None-Match headers? – just somebody Mar 05 '14 at 15:59
  • I'm using the chrome developer tools and I'm sure the cache is not ignored because I did a similar thing in php. – Daniele P. Mar 05 '14 at 16:14
  • modified code: `response.writeHead(200, { "Cache-Control": "public, max-age=86400", "Expires": new Date(Date.now() + 86400000).toUTCString(), "Keep-Alive": "timeout=5, max=100", "Content-Type": contentType, "content-length": data.length}); response.write(data); response.end(); ` – Daniele P. Mar 05 '14 at 16:15
  • here is the response header: `HTTP/1.1 200 OK Cache-Control: public, max-age=86400 Expires: Thu, 06 Mar 2014 16:04:59 GMT Keep-Alive: timeout=5, max=100 Content-Type: image/jpg content-length: 10244 Date: Wed, 05 Mar 2014 16:04:59 GMT Connection: keep-alive` – Daniele P. Mar 05 '14 at 16:17
  • Why are you reinventing the wheel? If your files are on disk, there are plenty of modules that handle serving files from disk with proper support for caching and conditional GETs. Connect/Express' `static` comes to mind. – josh3736 Mar 05 '14 at 16:31
  • @DanieleP. what are the *request* headers of subsequent requests? – just somebody Mar 06 '14 at 10:31

1 Answer

After a lot of testing I came up with a solution that seems to handle this caching problem well.

Basically, I check the request for a header named "if-modified-since". If it is present and its value (a date) matches the modified date of the file, the response is a 304 status with no content. If the header is missing or differs from the file's modified date, I send the complete response with status 200, including the Last-Modified header that the browser needs for later conditional requests.
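The decision described above can be isolated as a small pure function. This is only a sketch: `isNotModified` is a hypothetical helper name, not something from the original test, and it assumes the file's mtime should be truncated to whole seconds before comparing, since HTTP dates carry no milliseconds. It keeps the exact-match comparison described here rather than a "not newer than" check.

```javascript
// Hypothetical helper (not part of the original test): decides whether a
// 304 Not Modified response is appropriate for this request.
function isNotModified(ifModifiedSince, mtime) {
    if (!ifModifiedSince) {
        return false; // unconditional request: send the full response
    }
    var headerTime = new Date(ifModifiedSince).getTime();
    if (isNaN(headerTime)) {
        return false; // unparseable date: treat as unconditional
    }
    // Assumption: HTTP dates carry whole seconds, so truncate mtime first
    var fileTime = Math.floor(mtime.getTime() / 1000) * 1000;
    return headerTime === fileTime;
}

var mtime = new Date("2014-03-05T16:04:59.123Z");
console.log(isNotModified(undefined, mtime));                        // false
console.log(isNotModified(mtime.toUTCString(), mtime));              // true
console.log(isNotModified("Wed, 05 Mar 2014 16:04:58 GMT", mtime));  // false
```

Keeping the comparison in one function makes it easy to check the three cases (no header, matching header, stale header) without starting a server.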

By "working" I mean that the first request gets the file from the server, while subsequent requests get a 304 response with no body and the browser loads the image from its local cache.

Here is the complete code of the working test:

var http    = require("http");
var url     = require("url");
var fs      = require('fs');

function onRequest(request, response) {
    var pathName = url.parse(request.url).pathname;

    if (pathName!="/favicon.ico") {
        responseAction(pathName, request, response);
    } else {
        response.end();
    }
}


function responseAction(pathName, request, response) {
    console.log(pathName);

    var filePath = "/var/www/radar.jpg";

    // Get some info about the file
    var stats = fs.statSync(filePath);
    // HTTP dates have one-second resolution, so drop the milliseconds from mtime
    var mtime = new Date(Math.floor(stats.mtime.getTime() / 1000) * 1000);

    // Get the if-modified-since header from the request
    var reqModDate = request.headers["if-modified-since"];

    // Check whether if-modified-since is the same as the mtime of the file
    if (reqModDate != null && new Date(reqModDate).getTime() == mtime.getTime()) {
        // Yes: send a 304 header without image data (the browser loads it from cache)
        console.log("load from cache");
        response.writeHead(304, {
            "Last-Modified": mtime.toUTCString()
        });
        response.end();
        return true;
    }

    // No (header missing or different): send the headers and the image
    console.log("no cache");
    var img = fs.readFileSync(filePath);
    response.writeHead(200, {
        "Content-Type": "image/jpeg",
        "Last-Modified": mtime.toUTCString(),
        "Content-Length": img.length
    });
    response.write(img);
    response.end();
    return true;
}

http.createServer(onRequest).listen(8889);
console.log("Server has started.");

Of course, I don't want to reinvent the wheel: this is a benchmark for a more complex server previously developed in PHP, and this script is a sort of port of this PHP code:

http://us.php.net/manual/en/function.header.php#61903

I hope this will help!

Please let me know if you find any errors or anything that could be improved!

Thanks a lot, Daniele

Daniele P.
  • Tip: One shouldn't use synchronous methods such as statSync in production. Use asynchronous equivalents instead. – Sam Apr 18 '15 at 01:19
  • @Sam I think this in particular is a case sync methods were meant for. – Tomáš Zato Oct 17 '15 at 18:13
  • I disagree, async should be used whenever possible (in production), even for the simplest things; that is the very reason nodejs is able to allocate events to other tasks at the same time. The power of "event based" is based on those async callbacks (aka events). I am not sure if statSync is using an external API; nevertheless, the very wait for "statSync" to respond will block, and even if negligible it can add up. – vasilevich Nov 29 '16 at 06:48
  • Looks like mtime will be accurate to greater than a second but the if modified since headers are only accurate to the nearest second... – Drew Freyling Oct 13 '17 at 05:21
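
The two points raised in the comments above (asynchronous file access and the one-second resolution of HTTP dates) could be combined roughly as follows. This is a sketch under assumptions, not the answer's tested code: `serveCachedFile` and `toHttpSeconds` are hypothetical names, and the `Cache-Control` value simply reuses the one-day lifetime from the question.

```javascript
var fs = require("fs");

var ONE_DAY_SECONDS = 86400; // the one-day cache lifetime asked for in the question

// Hypothetical helper: truncate a Date to whole seconds (as epoch milliseconds),
// which is the resolution of HTTP date headers.
function toHttpSeconds(date) {
    return Math.floor(date.getTime() / 1000) * 1000;
}

// Hypothetical handler: serves filePath without blocking the event loop and
// answers 304 when the client's If-Modified-Since matches the file's mtime.
function serveCachedFile(filePath, contentType, request, response) {
    fs.stat(filePath, function (statErr, stats) {
        if (statErr) {
            response.writeHead(404, { "Content-Type": "text/plain" });
            response.end("Not found");
            return;
        }
        var lastModified = new Date(toHttpSeconds(stats.mtime));
        var headers = {
            "Last-Modified": lastModified.toUTCString(),
            "Cache-Control": "public, max-age=" + ONE_DAY_SECONDS
        };
        var since = Date.parse(request.headers["if-modified-since"] || "");
        if (!isNaN(since) && since === lastModified.getTime()) {
            response.writeHead(304, headers); // a 304 carries no body
            response.end();
            return;
        }
        fs.readFile(filePath, function (readErr, data) {
            if (readErr) {
                response.writeHead(500, { "Content-Type": "text/plain" });
                response.end("Read error");
                return;
            }
            headers["Content-Type"] = contentType;
            headers["Content-Length"] = data.length;
            response.writeHead(200, headers);
            response.end(data);
        });
    });
}
```

It could be wired into the test server with something like `http.createServer(function (req, res) { serveCachedFile("/var/www/radar.jpg", "image/jpeg", req, res); }).listen(8889);`.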