35

Md5sum is used to check integrity of iso image of say ubuntu.How does it work?

nik
  • 13,254
  • 3
  • 41
  • 57
TwiggedToday
  • 9,925
  • 7
  • 26
  • 16
  • Most of the answers here explain how MD5 works, not how MD5Sum works. For instance, does it combine file size with every 1st bit of every bite in a file? Does it process straight through an entire file? I was interested in these things and I think the asker was also. Hashing can be done on many different things but how it's applied to files to produce a checksum is something different – Sprague Jul 01 '10 at 19:22
  • 1
    All md5sum does is perform MD5. It might work against several files, still produces a single MD5 for each file. There is absolutely no difference. – defines Jul 30 '13 at 22:48

3 Answers3

40

The algorithm Message Digest 5 or MD5, it is a one-way cryptographic hash function. MD5 performs many binary operations on the "message" (binary data, for example of an iso image) to compute a 128-bit "hash".

It is useful to check the integrity of a downloaded package such as ubuntu because generating the MD5 hash will be identical if the package is exactly the same as the authenticated source. If there are errors in the download (from network interference, etc.) or if you downloaded a malicious package that was made to look like ubuntu but contained other code/whatever, then the MD5 would not match and you would know something was different and you should download again from a different source.

UPDATE

If the above doesn't explain "how it works" quite literally enough for you, here is my own javascript implementation of the MD5 algorithm:

/**
 * JavaScript Class Implementation of RSA Data Security, Inc.
 * Message Digest 5 (MD5) Algorithm, as defined in RFC 1321.
 * 
 * @author    Defines Fineout <defines.fineout@gmail.com>
 * @created   10/10/2009 11:23:29 AM
 * @modified  10/10/2009 12:34:56 PM
 * @version   0.1 WORKING
**/

/**
 * Usage example:
 * var md5 = new MD5();
 * alert(md5.hash('Defines Fineout')); // 2237cdf43f2a6242ad67be9a5242c957
**/
function MD5()
{
  this.F = function(x,y,z) { return (x & y) | ((~x) & z); };
  this.G = function(x,y,z) { return (x & z) | (y & (~z)); };
  this.H = function(x,y,z) { return (x ^ y ^ z); };
  this.I = function(x,y,z) { return (y ^ (x | (~z))); };
  this.C = function(q,a,b,x,s,ac) { return this.addu(this.rol(this.addu(this.addu(a,q),this.addu(x,ac)),s),b); };
  this.FF = function(a,b,c,d,x,s,ac) { return this.C((b & c) | ((~b) & d),a,b,x,s,ac); };
  this.GG = function(a,b,c,d,x,s,ac) { return this.C((b & d) | (c & (~d)),a,b,x,s,ac); };
  this.HH = function(a,b,c,d,x,s,ac) { return this.C(b ^ c ^ d,a,b,x,s,ac); };
  this.II = function(a,b,c,d,x,s,ac) { return this.C(c ^ (b | (~d)),a,b,x,s,ac); };
  
  this.hash = function(message)
  {
    var xl,x=[],k,aa,bb,cc,dd,a=0x67452301,b=0xEFCDAB89,c=0x98BADCFE,d=0x10325476;
    x = this.convertToWordArray(this.utf8Encode(message));
    xl = x.length;
    for (var j = 0; j < xl; j += 16) {
      aa=a; bb=b; cc=c; dd=d;
      a=this.FF(a,b,c,d,x[j+0],7,0xD76AA478);
      d=this.FF(d,a,b,c,x[j+1],12,0xE8C7B756);
      c=this.FF(c,d,a,b,x[j+2],17,0x242070DB);
      b=this.FF(b,c,d,a,x[j+3],22,0xC1BDCEEE);
      a=this.FF(a,b,c,d,x[j+4],7,0xF57C0FAF);
      d=this.FF(d,a,b,c,x[j+5],12,0x4787C62A);
      c=this.FF(c,d,a,b,x[j+6],17,0xA8304613);
      b=this.FF(b,c,d,a,x[j+7],22,0xFD469501);
      a=this.FF(a,b,c,d,x[j+8],7,0x698098D8);
      d=this.FF(d,a,b,c,x[j+9],12,0x8B44F7AF);
      c=this.FF(c,d,a,b,x[j+10],17,0xFFFF5BB1);
      b=this.FF(b,c,d,a,x[j+11],22,0x895CD7BE);
      a=this.FF(a,b,c,d,x[j+12],7,0x6B901122);
      d=this.FF(d,a,b,c,x[j+13],12,0xFD987193);
      c=this.FF(c,d,a,b,x[j+14],17,0xA679438E);
      b=this.FF(b,c,d,a,x[j+15],22,0x49B40821);
      a=this.GG(a,b,c,d,x[j+1],5,0xF61E2562);
      d=this.GG(d,a,b,c,x[j+6],9,0xC040B340);
      c=this.GG(c,d,a,b,x[j+11],14,0x265E5A51);
      b=this.GG(b,c,d,a,x[j+0],20,0xE9B6C7AA);
      a=this.GG(a,b,c,d,x[j+5],5,0xD62F105D);
      d=this.GG(d,a,b,c,x[j+10],9,0x2441453);
      c=this.GG(c,d,a,b,x[j+15],14,0xD8A1E681);
      b=this.GG(b,c,d,a,x[j+4],20,0xE7D3FBC8);
      a=this.GG(a,b,c,d,x[j+9],5,0x21E1CDE6);
      d=this.GG(d,a,b,c,x[j+14],9,0xC33707D6);
      c=this.GG(c,d,a,b,x[j+3],14,0xF4D50D87);
      b=this.GG(b,c,d,a,x[j+8],20,0x455A14ED);
      a=this.GG(a,b,c,d,x[j+13],5,0xA9E3E905);
      d=this.GG(d,a,b,c,x[j+2],9,0xFCEFA3F8);
      c=this.GG(c,d,a,b,x[j+7],14,0x676F02D9);
      b=this.GG(b,c,d,a,x[j+12],20,0x8D2A4C8A);
      a=this.HH(a,b,c,d,x[j+5],4,0xFFFA3942);
      d=this.HH(d,a,b,c,x[j+8],11,0x8771F681);
      c=this.HH(c,d,a,b,x[j+11],16,0x6D9D6122);
      b=this.HH(b,c,d,a,x[j+14],23,0xFDE5380C);
      a=this.HH(a,b,c,d,x[j+1],4,0xA4BEEA44);
      d=this.HH(d,a,b,c,x[j+4],11,0x4BDECFA9);
      c=this.HH(c,d,a,b,x[j+7],16,0xF6BB4B60);
      b=this.HH(b,c,d,a,x[j+10],23,0xBEBFBC70);
      a=this.HH(a,b,c,d,x[j+13],4,0x289B7EC6);
      d=this.HH(d,a,b,c,x[j+0],11,0xEAA127FA);
      c=this.HH(c,d,a,b,x[j+3],16,0xD4EF3085);
      b=this.HH(b,c,d,a,x[j+6],23,0x4881D05);
      a=this.HH(a,b,c,d,x[j+9],4,0xD9D4D039);
      d=this.HH(d,a,b,c,x[j+12],11,0xE6DB99E5);
      c=this.HH(c,d,a,b,x[j+15],16,0x1FA27CF8);
      b=this.HH(b,c,d,a,x[j+2],23,0xC4AC5665);
      a=this.II(a,b,c,d,x[j+0],6,0xF4292244);
      d=this.II(d,a,b,c,x[j+7],10,0x432AFF97);
      c=this.II(c,d,a,b,x[j+14],15,0xAB9423A7);
      b=this.II(b,c,d,a,x[j+5],21,0xFC93A039);
      a=this.II(a,b,c,d,x[j+12],6,0x655B59C3);
      d=this.II(d,a,b,c,x[j+3],10,0x8F0CCC92);
      c=this.II(c,d,a,b,x[j+10],15,0xFFEFF47D);
      b=this.II(b,c,d,a,x[j+1],21,0x85845DD1);
      a=this.II(a,b,c,d,x[j+8],6,0x6FA87E4F);
      d=this.II(d,a,b,c,x[j+15],10,0xFE2CE6E0);
      c=this.II(c,d,a,b,x[j+6],15,0xA3014314);
      b=this.II(b,c,d,a,x[j+13],21,0x4E0811A1);
      a=this.II(a,b,c,d,x[j+4],6,0xF7537E82);
      d=this.II(d,a,b,c,x[j+11],10,0xBD3AF235);
      c=this.II(c,d,a,b,x[j+2],15,0x2AD7D2BB);
      b=this.II(b,c,d,a,x[j+9],21,0xEB86D391);
      a=this.addu(a,aa); b=this.addu(b,bb); c=this.addu(c,cc); d=this.addu(d,dd);
    }
    return (this.wordToHex(a)+this.wordToHex(b)+this.wordToHex(c)+this.wordToHex(d)).toLowerCase();
  };
  
  this.test = function()
  {
    if (this.hash('Dustin Fineout') == '8844be37f4e8b3973b48b95b0c69f0b1') {
      return true;
    }
    return false;
  };
  
  this.addu = function(x, y)
  {
    var ls = (x & 0xFFFF) + (y & 0xFFFF);
    return (((x >> 16) + (y >> 16) + (ls >> 16)) << 16) | (ls & 0xFFFF);
  };
  
  this.rol = function(v, s)
  {
    return (v << s) | (v >>> (32 - s));
  };
  
  this.utf8Encode = function(str)
  {
    return unescape(encodeURIComponent(str));
  };
  
  this.convertToWordArray = function(str)
  {
    var lWordCount;
    var lMessageLength = str.length;
    var lNumberOfWords_temp1=lMessageLength + 8;
    var lNumberOfWords_temp2=(lNumberOfWords_temp1-(lNumberOfWords_temp1 % 64))/64;
    var lNumberOfWords = (lNumberOfWords_temp2+1)*16;
    var lWordArray=new Array(lNumberOfWords-1);
    var lBytePosition = 0;
    var lByteCount = 0;
    while ( lByteCount < lMessageLength ) {
      lWordCount = (lByteCount-(lByteCount % 4))/4;
      lBytePosition = (lByteCount % 4)*8;
      lWordArray[lWordCount] = (lWordArray[lWordCount] | (str.charCodeAt(lByteCount)<<lBytePosition));
      lByteCount++;
    }
    lWordCount = (lByteCount-(lByteCount % 4))/4;
    lBytePosition = (lByteCount % 4)*8;
    lWordArray[lWordCount] = lWordArray[lWordCount] | (0x80<<lBytePosition);
    lWordArray[lNumberOfWords-2] = lMessageLength<<3;
    lWordArray[lNumberOfWords-1] = lMessageLength>>>29;
    return lWordArray;
  };
  
  this.wordToHex = function(lValue)
  {
    var wordToHexValue="",wordToHexValue_temp="",lByte,lCount;
    for (lCount = 0;lCount<=3;lCount++) {
      lByte = (lValue>>>(lCount*8)) & 255;
      wordToHexValue_temp = "0" + lByte.toString(16);
      wordToHexValue = wordToHexValue + wordToHexValue_temp.substr(wordToHexValue_temp.length-2,2);
    }
    return wordToHexValue;
  };
}

I'll be happy to annotate or expand as desired :) Also, please note this isn't the singleton pattern I still use for JavaScript namespaces - it was actually my first attempt at such a pattern. There are many delicious singleton patterns available from Douglas Crockford.

defines
  • 10,229
  • 4
  • 40
  • 56
  • 3
    Just a little added tidbit: The MD5 algorithm (and indeed most cryptographic hash function algorithms) are constructed so that even one bit of difference between two inputs will cause a huge change in the hash function output. This property makes it very difficult to create two input messages that hash to the same output. – Mike Daniels Jun 15 '09 at 17:41
  • 8
    Does this really answer the question? It doesn't seem to explain how MD5 works – MiJyn Jul 30 '13 at 07:57
  • Actually yes it does in the first paragraph - MD5 performs many binary operations on the "message" (binary data, for example of an iso image) to compute a 128-bit "hash". But, if you need a more literal answer, let me dig up my javascript implementation and post it. You can interpret the code for yourself. There isn't much interesting going on :) – defines Jul 30 '13 at 20:54
  • Is there any vulnerabilities in this, assuming that it's client-side code? – Daniel Node.js May 03 '14 at 18:41
  • 1
    Of course there are, as with any code, and depending on your use case and context :) I provide no warranty; this is a simple implementation of the RFC standard for demonstration and educational purposes only. – defines May 04 '14 at 14:33
  • Did you just write this? Wow. – JBis Jul 09 '18 at 14:02
  • @JBis honestly I don't remember anymore. I might have implemented from reading the standard, but it was probably a simple port from another public implementation from c/python/php. Nothing groundbreaking here, but thanks! :) – defines Jul 09 '18 at 17:54
  • The answer did not address the question properly, it did not explain the "how", that is the algorithmic thinking of forming MD5 sum. This answer provided a purportedly valid implementation of MD5 sum which still left the original question unanswered. – Yiling Apr 20 '21 at 04:23
  • Based on comments over the years asking how one-way hashing works as a security mechanism more generally than my original answer, I'm going to add an update with such a discussion. Thanks for the prods! – defines Jun 14 '21 at 16:04
1

MD5Sum is a file checksum generating tool using MD5 as the hashing algorithm. It provides the user with a reasonable assurance that the file was untampered with.

In order to do this, the user will need to (via shell integration or manually) process their own MD5 hash and compare it to the hash provided by the uploader.

An MD5Sum hash won't assure you that the uploader who provided the checksum didn't fiddle with the file, to do that you should trust the source of the MD5 checksum (ie. get the checksum from Ubuntu instead of the download site providing the image).

Sprague
  • 1,610
  • 10
  • 22
0

md5sum is a tool used to verify the integrity of files, as virtually any change to a file will cause its MD5 hash to change.

Most commonly, md5sum is used to verify that a file has not changed as a result of a faulty file transfer, a disk error or non-malicious meddling. The md5sum program is included in most Unix-like operating systems or compatibility layers such as Cygwin.

You will need to run md5sum on the ISO image and it will output something as below:

Open terminal and execute the following:

md5sum myisofile.iso > out.txt

cat out.txt (this will output as below)

myisofile.iso   f6de8b885f7a1917e5368fcb947d1738

This MD5 hash need to be compared with the source myisofile.iso. If both MD5 hash is the same, means the ISO is untampered.

Du-Lacoste
  • 11,530
  • 2
  • 71
  • 51