1

I'm writing my own simple MD5 implementation in Swift. I'm using it to generate a WEP key. (To generate a 128 bit WEP key you can use an MD5 hash function. You extend the passhphrase to a length of 64 bytes by concatenating it with itself and then running it through the MD5 hash algorithm.) I understand that WEP is insecure, but the constraint is that I have to generate a WEP encrypted key from a passphrase. I don't get to pick to use WPA2 instead unfortunately.

Here is my implementation below. I think I have a minor bug somewhere that is throwing off the whole thing. If I attempt to generate an md5 hash from a given string, I do not generate the correct hash.

Anyone know where my mistake is? Thanks for your time.

Notes on code below:
- I created an operator to perform circular shifts during the main loop of the md5 algorithm.
- I used an index variable j to keep track of the variables in {a,b,c,d} that you pass into the function in each sub-round of the main loop. (The alternative would be to swap the value stored in each variable after each iteration.)

// MD5 Implementation
/** Circular Shift implementation for 32 bit unsigned integers **/
protocol CircularShiftable {
    func leftCircularShift(value: UInt32, amount: UInt32) -> UInt32
}
extension UInt32: CircularShiftable {
    func leftCircularShift(value: UInt32, amount: UInt32) -> UInt32 {
        print(String((value << UInt32(amount)), radix: 2))
        print(String((value >> (UInt32(32 - amount))), radix: 2))
        return (amount == 0) ? value : ((value << UInt32(amount)) | (value >> (UInt32(32 - amount))))
    }
}
infix operator <<< { associativity left precedence 160 }
func <<< <T: CircularShiftable>(lhs: T, rhs: UInt32) -> UInt32 {
    return lhs.leftCircularShift(lhs as! UInt32, amount: rhs)
}

    // t[i] indicates the integer part of 2^32*abs(sin(i)) where i is in radians
    private let t: [UInt32] =
        // Round 1
       [0xd76aa478,0xe8c7b756,0x242070db,0xc1bdceee,
    0xf57c0faf,0x4787c62a,0xa8304613,0xfd469501,
    0x698098d8,0x8b44f7af,0xffff5bb1,0x895cd7be,
    0x6b901122,0xfd987193,0xa679438e,0x49b40821,
    // Round 2
    0xf61e2562,0xc040b340,0x265e5a51,0xe9b6c7aa,
    0xd62f105d,0x2441453,0xd8a1e681,0xe7d3fbc8,
    0x21e1cde6,0xc33707d6,0xf4d50d87,0x455a14ed,
    0xa9e3e905,0xfcefa3f8,0x676f02d9,0x8d2a4c8a,
    // Round 3
    0xfffa3942,0x8771f681,0x6d9d6122,0xfde5380c,
    0xa4beea44,0x4bdecfa9,0xf6bb4b60,0xbebfbc70,
    0x289b7ec6,0xeaa127fa,0xd4ef3085,0x4881d05,
    0xd9d4d039,0xe6db99e5,0x1fa27cf8,0xc4ac5665,
    // Round 4
    0xf4292244,0x432aff97,0xab9423a7,0xfc93a039,
    0x655b59c3,0x8f0ccc92,0xffeff47d,0x85845dd1,
    0x6fa87e4f,0xfe2ce6e0,0xa3014314,0x4e0811a1,
    0xf7537e82,0xbd3af235,0x2ad7d2bb,0xeb86d391]

// Each round uses a sequence of shift amounts in each operation
//       Operation number: 1  2   3   4   5  6   7   8   9  10  11  12  13 14  15  16
private let s: [UInt32] = [7, 12, 17, 22, 7, 12, 17, 22, 7, 12, 17, 22, 7, 12, 17, 22,
                           5, 9,  14, 20, 5, 9,  14, 20, 5, 9,  14, 20, 5, 9,  14, 20,
                           4, 11, 16, 23, 4, 11, 16, 23, 4, 11, 16, 23, 4, 11, 16, 23,
                           6, 10, 15, 21, 6, 10, 15, 21, 6, 10, 15, 21, 6, 10, 15, 21]

// Each of the four rounds uses a specific nonlinear function in the set S = { F, G, H, I }
let F = { (X: UInt32, Y: UInt32, Z: UInt32) -> UInt32 in
    return (X & Y) | ((~X) & Z)
}
let G = { (X: UInt32, Y: UInt32, Z: UInt32) -> UInt32 in
    return (X & Y) | (Y & (~Z))
}
let H = { (X: UInt32, Y: UInt32, Z: UInt32) -> UInt32 in
    return X ^ Y ^ Z
}
let I = { (X: UInt32, Y: UInt32, Z: UInt32) -> UInt32 in
    return Y ^ (X | (~Z))
}

func wep_encrypt_128(passphrase: String) -> String {
    // Pad message length to 64 bytes in length & convert to 32 bit unsigned integer array
    let msg = pad_msg(passphrase)

    // For each chunk of 512 in resulting msg, process in the main loop of the algorithm
    let blocks = chunk_msg(msg)
    let md5_key = md5_main_loop(blocks)

    // Format the resulting md5 generated key
    // Testing key: FAE93BD77058E8EDFDE8BE5C39
    return (md5_key.reduce(String("")) { (key, piece) -> String in
        return key + String(format: "%2X", piece)
    })
}

func chunk_msg(msg: [UInt8]) -> [[UInt8]] {
    var blocks: [[UInt8]] = []
    for (var i = 0; i < msg.count/64; i++) {
       blocks.append([UInt8](msg[64*i..<(64*i+64)]))
    }
    return blocks
}

func pad_msg(passphrase: String) -> [UInt8] {
    // For WEP you take the initial passphrase and add it to itself until you end up at 64 bytes in length
    var msg = passphrase
    for (var i = 0; msg.characters.count < 64; i = (i+1) % passphrase.characters.count) {
        msg.append(passphrase[passphrase.startIndex.advancedBy(i)])
    }

    // For MD5 you want a msg padded such that the length is 64 bits shy of a multiple of 512 (1 bit, then as many 0s as needed)
    var unicode_byte_values: [UInt8] = msg.utf8.map({$0 as UInt8})
    unicode_byte_values.append(0x80)    // MD5 uses big endian for encoding of bits into bytes
    // Note: (x % 64) == 56 is true when x is 8 bytes less than a multiple of 512 bits
    while ((unicode_byte_values.count % 64) != 56) {
        unicode_byte_values.append(0x00)
    }

    // Append little endian representation of the pre-MD5 padded String length (should always be 64 bytes for WEP encryption)
    //64 * 8 = 512 bits = 00000000 00000000 00000000 00000000 00000000 00000000 00000010 00000000 in big endian...
    //64 * 8 = 512 bits = 00000000 10000000 00000000 00000000 00000000 00000000 00000000 00000000 in little endian...
    var length = [UInt8](count: 8, repeatedValue: 0)
    length[length.endIndex.advancedBy(-7)] = 0x80
    length = length.reverse()
    unicode_byte_values += length

    return unicode_byte_values
}

// Convert each block of 4 bytes to a 32 bit word (using little endian notation)
func convert_to_sub_blocks(b: [UInt8]) -> [UInt32] {
    var m_block: [UInt32] = []
    for (var i = 0; i < b.count; i+=4) {
        let BLK_A = UInt32(b[b.startIndex.advancedBy(i)])
        let BLK_B = UInt32(b[b.startIndex.advancedBy(i+1)]) << 8
        let BLK_C = UInt32(b[b.startIndex.advancedBy(i+2)]) << 16
        let BLK_D = UInt32(b[b.startIndex.advancedBy(i+3)]) << 24
        m_block.append(UInt32(BLK_A | BLK_B | BLK_C | BLK_D))
    }
    return m_block
}

func md5_main_loop(blocks: [[UInt8]]) -> [UInt8] {
    // Chaining variables for the MD5 algorithm
    var A: UInt32 = 0x67452301
    var B: UInt32 = 0xEFCDAB89
    var C: UInt32 = 0x98BADCFE
    var D: UInt32 = 0x10325476

    // FF(a,b,c,d,Mj,s,ti) => a = b + ((a + F(b,c,d) + Mj + ti) <<< s)
    var M = [UInt32]()
    for (var i = 0; i < blocks.count; i++) {

        // There are 16 32-bit message sub-blocks per each 512-bit block of input text
        M = convert_to_sub_blocks(blocks[i])

        // Initialize vars for round
        var round_vars: [UInt32] = [A, B, C, D]
        var j = 0

        // Each round has 16 operations, thus 4 * 16 = 64 operations
        for (var l = 0; l < 64; l++) {

            // Set indexing relative to j
            let a = j
            let b = (a == 3) ? 0 : j+1
            let c = (b == 3) ? 0 : b+1
            let d = (c == 3) ? 0 : c+1

            // Set operation to apply and index into message structure to access based off round position
            var op: (UInt32, UInt32, UInt32) -> UInt32
            var m_i: Int
            if (l < 16) {       // Round 1
                op = F
                m_i = l
            }
            else if (l < 32) {  // Round 2
                op = G
                m_i = (5 * l + 1) % 16
            }
            else if (l < 48) {  // Round 3
                op = H
                m_i = (3 * l + 5) % 16
            }
            else {              // Round 4
                op = I
                m_i = (7 * l) % 16
            }

            // Compute value for variable for sub-round
            round_vars[a] = round_vars[b] &+ (round_vars[a] &+ op(round_vars[b], round_vars[c], round_vars[d]) &+ M[m_i] &+ t[l] <<< s[l])

            // Shift index for cycling through round_vars
            if ((l % 16 == 0) && (l != 0)) { j = 0 }
            else { j = (j == 0) ? 3 : --j }
        }

        // Update chaining variables for next block of data
        A = A &+ round_vars[0]
        B = B &+ round_vars[1]
        C = C &+ round_vars[2]
        D = D &+ round_vars[3]
    }

    return form_hash([A,B,C,D])
}

// Forming the hash requires concatenating the final values stored in A,B,C,D
func form_hash(sub_blocks: [UInt32]) -> [UInt8] {
    var key_hash: [UInt8] = []
    for b in sub_blocks {
        key_hash += [UInt8((b      )  & 0xFF)]
        key_hash += [UInt8((b >>  8)  & 0xFF)]
        key_hash += [UInt8((b >> 16) & 0xFF)]
        key_hash += [UInt8((b >> 24) & 0xFF)]
    }
    return key_hash
}

// Testing key: FAE93BD77058E8EDFDE8BE5C39
wep_encrypt_128(psk)
  • `I think I have a minor bug somewhere` Ok but why? What should be the expected result compared to the result you get? Thanks. – Eric Aya Jan 12 '16 at 13:39
  • @EricD. I have used a few different online MD5 generators to compute the MD5 hash of a given string, and they all agree with each other... but they disagree with my code. For example if I want to use the string 'tomatojuice' it gets extended to 64 characters in length as 'tomatojuicetomatojuicetomatojuicetomatojuicetomatojuicetomatojui'. If you run that through an md5 hash generation function you get '40c1a019290ac90ab56bb81f2c8c99ea' and my implementation above gets '7D8A33B1C783DD4B FB189AB6A906D3E'. I'm not generating the proper md5 hash. – crispy_bacon Jan 12 '16 at 13:43
  • Ok, wasn't sure about what you meant. Thanks. – Eric Aya Jan 12 '16 at 13:44
  • No problem. I will update the question to make that clear. – crispy_bacon Jan 12 '16 at 13:45
  • For a MD5 Swift implementation see [MD5.swift](https://github.com/krzyzanowskim/CryptoSwift/blob/master/Sources/CryptoSwift/MD5.swift) by Marcin Krzyzanowski. – zaph Jan 12 '16 at 14:26
  • Interesting, his is pretty similar in some ways. Thanks for the link! Anyone else have ideas? – crispy_bacon Jan 12 '16 at 14:30

1 Answers1

0

I noticed a few typos here and there from putting this thing together:

  • My length was calculated wrong. I should have appended a two in the 2nd byte from the end, and then reversed it. (I thoroughly confused myself in writing up the question with endianness and byte swapping, so it was just way off in the original implementation. I noticed it when I did some of the steps for the algorithm out on paper.)
  • G was typed incorrect: it should read (X & Z) | (Y & (~Z)), where I had accidentally put a Y in place of the bold typed Z

Thanks for your help everyone! It works now :)