0

I am trying to code in Hamming code(python language) on data present in a .txt file. How can I go about it? DO I need to read out the data line by line, convert into ascii characters and then compute hamming code on it. Or is there any function or library in python that can serve as a window and can operate on the whole file as one?

Your reply is very appreciated. Than you in advance.

Edit: The scenario is a client server architecture. Client tries to upload a file to server after computing hamming code of the data and stores it in server. Later when it tries to retrieve back the file, it checks the hamming code and detects any error that may have occurred.

kate
  • 113
  • 1
  • 5
  • 17
  • There are many forms of Hamming codes (http://en.wikipedia.org/wiki/Hamming_code). What are you trying to accomplish? A common approach may be to apply a 7,4 hamming code to each nybble of each byte in the file, which would produce a new file almost twice the size. But, even with a 7,4 code, it's up to you how to represent the encoded data. – Neil Nov 26 '13 at 05:34
  • Yeah, I would try (7,4) hamming code on the data. My focus here is fault tolerance so I wouldn't mind doubling the file size. My question is if the reading out the data is unavoidable or if there is any library function in python that can serve my purpose. Thanks @Neil – kate Nov 26 '13 at 05:43
  • You can mmap the file – John La Rooy Nov 26 '13 at 05:46
  • I don't know of any library for it, but for a 7,4 code, just use a dict to map bytes in your text file to their hamming encoded forms. Only 256 entries in the dict, and the encoded forms only need to be computed once. Easy to write, and very fast to run. – Neil Nov 26 '13 at 05:53
  • i am sorry but i am new to python and find it little difficult to follow your tip..can you be more explicit..thanks @Neil – kate Nov 26 '13 at 06:02
  • could you please elaborate on that @gnibber. I tried to google but couldnt quite understand how I can relate to my current problem here. Thanks – kate Nov 27 '13 at 06:04

1 Answers1

1

Using a mapping:

# create a dict that maps input bytes to their hamming-encoded version.  This
# can be pre-calculated and hard-coded, or generated at startup
hamming = {
    0x00: 0x0000, # these numbers are nonsense.  Input byte 0x00 is
                  # being mapped to output bytes 0x0000
    0x01: 0x0101,
    ...
    0xff: 0x10cf
}

# read the source binary file
with open('input.bin', 'r') as infile:
    data = [int(x) for x in infile.read()]

# translate the input data into its encoded form (1 byte becomes 2 with parity added, etc)
output = ''
for byte in data:
    encoded = hamming[byte]
    output += chr((encoded >> 8) & 0xff)
    output += chr((encoded >> 0) & 0xff)

# write the encoded data to a file
with open('output.bin', 'w') as out:    
    out.write(output)

Apart from any bugs and inefficiencies here, it's just up to you to define the 256 entries in the dict hamming.

Neil
  • 2,378
  • 1
  • 20
  • 28