5

My app loads the user's text files, and each of them can be changed by the user. When the app starts, I want to check whether any file has changed since the last run. I think the most efficient way is to calculate a checksum of each file and save them all to one JSON file. On startup I would compute each file's checksum and compare it with the data in the JSON file. Is there a more efficient way of doing this? And how exactly do I calculate a file checksum?

Piotr Wu

2 Answers

8

I believe using fs.stat and checking the 'last modified' time (mtime) is much faster than reading the whole file and comparing checksums, because it only reads metadata and never touches the file's contents.

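Also, if your files are located in different directories, you can test whether a folder was changed, which can reduce I/O calls. Be aware that a folder's mtime changes when entries are added, removed or renamed in it, but not when the content of an existing file is modified, so an unchanged folder mtime only tells you that no files were added or removed; the existing files still need their own mtime check.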

You will have to store the 'last modified' date; I would use Redis for this. You need to save it on the first run and update it every time you detect a modification.

Here is a function (and a call to it) that tests whether a file or folder was changed:

const fs = require('fs');
const moment = require('moment');

const path = 'views'; // your folder path ('views' is an example folder)
wasFileChanged(path, (err, wasChanged) => {
  if (err) {
    return console.error(err);
  }
  if (wasChanged) {
    console.log('folder was changed, need to compare files');
    // need to update redis here
    // ...compare files to find what was changed
  } else {
    console.log('folder was not changed');
  }
});

/**
 * Checks if a file/folder was changed
 */
function wasFileChanged(path, callback) {
  // obtain the previous modified date of the folder (I would use redis to store/retrieve this data)
  const lastModified = '2016-12-03T00:41:12Z'; // put the stored string value here, this is just an example

  // fs.stat reports a missing path through err, so a separate fs.open
  // (which would also leak a file descriptor) is not needed
  fs.stat(path, (err, stats) => {
    if (err) {
      return callback(err);
    }

    // I use the moment module to compare dates
    const previousLMM = moment(lastModified);
    const currentLMM = moment(stats.mtime);
    const changed = !currentLMM.isSame(previousLMM, 'second'); // seconds granularity
    return callback(null, changed);
  });
}
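
If you don't want to run Redis just for this, the same idea can be sketched with a plain JSON file, as the question suggests. This is only a minimal sketch: `findChangedFiles` and the manifest layout are names I made up for illustration, and it uses the synchronous fs calls for brevity, which should be acceptable for a one-off check at startup.

```javascript
const fs = require('fs');

// Hypothetical helper (not part of any library): compares each file's mtime
// with the value saved in a JSON manifest on the previous run, rewrites the
// manifest, and returns the paths whose mtime differs or that are new.
function findChangedFiles(filePaths, manifestPath) {
  // Load the previous manifest; a missing manifest means this is the first run.
  let previous = {};
  try {
    previous = JSON.parse(fs.readFileSync(manifestPath, 'utf8'));
  } catch (err) {
    if (err.code !== 'ENOENT') throw err;
  }

  const current = {};
  const changed = [];
  for (const filePath of filePaths) {
    // Only metadata is read here, never the file contents.
    const mtimeMs = fs.statSync(filePath).mtimeMs;
    current[filePath] = mtimeMs;
    if (previous[filePath] !== mtimeMs) {
      changed.push(filePath);
    }
  }

  // Save the fresh values for the next start.
  fs.writeFileSync(manifestPath, JSON.stringify(current, null, 2));
  return changed;
}
```

On the first run every file is reported as changed, because there is nothing to compare against yet.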
Rayee Roded
  • Yes, that's an idea. I knew there should be something that doesn't force me to read each file, and 'last modified' fits my purposes very well. I would argue with you about that code, but the idea is enough for me :p Thanks – Piotr Wu Dec 03 '16 at 13:21
  • How can I test if a folder was changed? The mtime of a folder doesn't change if the content of a file inside it is modified: https://stackoverflow.com/questions/3620684/directory-last-modified-date – Sid Vishnoi Jun 14 '17 at 10:03
  • 2
    @SidVishnoi To check if a folder was changed you can first check its mtime (which is enough if files/folders were added/removed), then use fs to get the list of files and check whether any existing file was changed. If the folder's mtime AND all existing files are unchanged, then the folder wasn't changed. – Rayee Roded Jun 14 '17 at 18:45
  • Or we can create a hash of the mtimes of all child files and compare. This is what I was thinking. Is there any more efficient way? – Sid Vishnoi Jun 16 '17 at 07:39
  • Is `last modified` the same across different systems? If the file gets downloaded, wouldn't `last modified` date change to the time it was downloaded? – Pants Jan 22 '21 at 02:59
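
The "hash of the mtimes of all child files" idea from the comments above could look roughly like this. It is a sketch under assumptions: `folderFingerprint` is a made-up name, it only looks at the direct children of the folder (no recursion), and SHA-1 is used simply as a cheap fingerprint, not for security.

```javascript
const fs = require('fs');
const path = require('path');
const crypto = require('crypto');

// Hypothetical helper: builds one hash from the name, size and mtime of every
// direct child of a folder. If the fingerprint equals the one saved on the
// previous run, nothing was added, removed, renamed or modified in the folder.
function folderFingerprint(dirPath) {
  const hash = crypto.createHash('sha1');
  // Sort the names so the result does not depend on directory listing order.
  const names = fs.readdirSync(dirPath).sort();
  for (const name of names) {
    const stats = fs.statSync(path.join(dirPath, name));
    hash.update(JSON.stringify([name, stats.size, stats.mtimeMs]));
  }
  return hash.digest('hex');
}
```

This still costs one stat call per child, but it leaves a single string per folder to store and compare.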
2

Seems like this blog is a good read for you: http://blog.tompawlak.org/calculate-checksum-hash-nodejs-javascript

Code example (from the blog):

var crypto = require('crypto');

function checksum (str, algorithm, encoding) {
    return crypto
        .createHash(algorithm || 'md5')
        .update(str, 'utf8')
        .digest(encoding || 'hex');
}

To read from a file and present its hash:

var fs = require('fs');

fs.readFile('test.dat', function (err, data) {
    if (err) throw err;
    console.log(checksum(data));         // e.g. e53815e8c095e270c6560be1bb76a65d
    console.log(checksum(data, 'sha1')); // e.g. cd5855be428295a3cc1793d6e80ce47562d23def
});

Comparing checksums is a valid way to find out whether a file was changed. You can also compare the time the file was last modified by using fs.stat.
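
The two checks can also be combined so that a file is only read when its metadata looks different. The following is a rough sketch, not a definitive implementation: `fileChecksum`, `checkFile` and the `{ mtimeMs, size, checksum }` entry shape are hypothetical names chosen for illustration, and the file is hashed in chunks so that large files are never loaded into memory in one piece.

```javascript
const fs = require('fs');
const crypto = require('crypto');

// Hash a file in 64 KiB chunks instead of reading it all at once.
function fileChecksum(filePath, algorithm = 'sha1') {
  const hash = crypto.createHash(algorithm);
  const buffer = Buffer.alloc(64 * 1024);
  const fd = fs.openSync(filePath, 'r');
  try {
    let bytesRead;
    // position = null means "continue from the current file position"
    while ((bytesRead = fs.readSync(fd, buffer, 0, buffer.length, null)) > 0) {
      hash.update(buffer.subarray(0, bytesRead));
    }
  } finally {
    fs.closeSync(fd);
  }
  return hash.digest('hex');
}

// Hypothetical helper: `previous` is the entry saved for this file on the last
// run ({ mtimeMs, size, checksum }), or undefined if the file is new.
// Returns whether the content changed plus the entry to save for next time.
function checkFile(filePath, previous) {
  const stats = fs.statSync(filePath);
  if (previous && previous.mtimeMs === stats.mtimeMs && previous.size === stats.size) {
    // Metadata is identical, so skip reading the file.
    return { changed: false, entry: previous };
  }
  // Metadata differs (or there is no previous entry): confirm with a checksum.
  const checksum = fileChecksum(filePath);
  const changed = !previous || previous.checksum !== checksum;
  return { changed, entry: { mtimeMs: stats.mtimeMs, size: stats.size, checksum } };
}
```

The trade-off is that a file rewritten with the same size and an identical mtime would be missed by the cheap check; if that matters for your case, always compute the checksum.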

Rayee Roded
  • I saw that, but I am not sure about it. It would have to load a lot of files, and reading all of them does not seem to be the best option. I am worried about the efficiency of this. Maybe there is something more appropriate for my case – Piotr Wu Dec 02 '16 at 23:31
  • @PiotrWójcik node.js may have [crypto](https://nodejs.org/api/crypto.html) support built in. Checking for its availability is the second bullet in the link. – traktor Dec 03 '16 at 00:28