I'm working on Huffman encoding and decoding. I have encoded a string into binary using the Huffman algorithm, and now I want to send it to another computer over sockets using Python 3, where the encoded data will be decoded back. What would be the most efficient way of doing so?

Encoder part of the code:

import heapq
import socket

class HuffmanEncoder:
    output = {}  # maps each symbol to its code string
    class Node:
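        def __init__(self,data,freq,left=None,right=None):
            self.data = data
            self.freq = freq
            self.left = left
            self.right = right

        def __lt__(self,other):
            # Lets heapq compare (freq, node) tuples when two frequencies tie
            return self.freq < other.freq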

    def __init__(self,root):
        self.root = root

    @staticmethod
    def isLeaf(root):
        return not root.left and not root.right

    @staticmethod
    def buildHuffman(p):
        while len(p) != 1:
            left = heapq.heappop(p)[1]
            right = heapq.heappop(p)[1]
            top = HuffmanEncoder.Node('$',left.freq + right.freq)
            top.left = left
            top.right = right
            heapq.heappush(p,(top.freq,top))
        return heapq.heappop(p)[1]

    @staticmethod
    def printCodes(root,arr,top):
        if root.left:
            arr.insert(top,'0')
            HuffmanEncoder.printCodes(root.left,arr,top + 1)

        if root.right:
            arr.insert(top,'1')
            HuffmanEncoder.printCodes(root.right,arr,top + 1)

        if HuffmanEncoder.isLeaf(root):
            s = ""
            for i in range(0,top):
                s += arr[i]
            HuffmanEncoder.output[root.data] = s
        return HuffmanEncoder.output

def main():
    p = []
    arr = ['a','b','c','d','e','f','g','h','i','j','k','l','m','n','o','p','q','r','s','t','u','v','w','x','y','z',' ']
    freq = [8.167,1.492,2.782,4.253,12.702,2.228,2.015,6.094,6.966,0.153,0.772,4.025,2.406,6.749,7.507,1.929,0.095,5.987,6.327,9.056,2.758,0.978,2.360,0.150,1.974,0.074,25.422]
    for i in range(0,len(arr)):
        x = HuffmanEncoder.Node(arr[i],freq[i])
        heapq.heappush(p,(x.freq,x))

    root = HuffmanEncoder.buildHuffman(p)
    arr = []
    top = 0
    codes = HuffmanEncoder.printCodes(root,arr,top)
    for key in sorted(codes):
        print(key,codes[key])
    s = input()
    for i in range(0,len(s)):
        print(codes[s[i]])

if __name__ == '__main__':
    main()
the_blank
  • Possibly have a look at zeroMQ bindings for python: http://zeromq.org/bindings:python – kezzos Nov 23 '16 at 12:04
  • Can you indent your code for us? – moopet Nov 25 '16 at 08:59
  • Yeah, of course, sorry, I forgot. – the_blank Nov 26 '16 at 05:23
  • I'm not sure I understand your question. Are you asking for help with your client (which you haven't shown), or are you asking about the server code you have shown (which seems to work just fine, if you fix the import to `from socket import socket, AF_INET, SOCK_STREAM`)? The server isn't doing any encoding or decoding of bytes at all, so if you're asking about that, it seems a bit of a non sequitur. What do you think the difference is between a byte string and the "raw binary data" you have? – Blckknght Dec 25 '16 at 07:26
  • @Blckknght I have added my encoder code above. Now I see that it's not binary data in the first place, it's just a string of '0' and '1' characters. Does this save me any bandwidth? I mean, it's just characters after all, and I seem to get more characters than the actual message. – the_blank Dec 27 '16 at 07:08

2 Answers

1

Take a look at https://docs.python.org/3/howto/unicode.html. A Python 3 socket only sends bytes, so a string has to be encoded before it goes through the socket. With the default UTF-8 encoding, each code point is converted following this rule:

"if the value is < 128, it’s represented by the corresponding byte value. If the value is >= 128, it’s turned into a sequence of two, three, or four bytes, where each byte of the sequence is between 128 and 255."

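Once you have managed to convert an array of 8-bit data into a string, encode it with latin-1, which maps each code point from 0 to 255 to exactly one byte, and send it out using (where sock is your connected socket object)

sock.sendall(yourstring.encode('latin-1'))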
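
For the Huffman output in the question, which is a str of '0' and '1' characters, one way to get real 8-bit data is to pack eight code bits into each byte before sending. The following is only a minimal sketch: the helper names `pack_bits` and `unpack_bits` and the one-byte padding header are assumptions made for illustration, not part of the original code.

```python
def pack_bits(bits: str) -> bytes:
    """Pack a string of '0'/'1' characters into bytes.

    The first byte of the result stores how many padding bits were
    appended, so the receiver can strip them again (assumed format).
    """
    padding = (8 - len(bits) % 8) % 8
    padded = bits + "0" * padding
    body = bytes(int(padded[i:i + 8], 2) for i in range(0, len(padded), 8))
    return bytes([padding]) + body


def unpack_bits(data: bytes) -> str:
    """Reverse pack_bits: return the original string of '0'/'1' characters."""
    padding = data[0]
    bits = "".join(format(byte, "08b") for byte in data[1:])
    return bits[:len(bits) - padding] if padding else bits
```

The sender would then pass `pack_bits(encoded)` to the socket, and the receiver would call `unpack_bits` on the received bytes before walking the Huffman tree. Sent this way, each code bit costs one bit on the wire instead of one whole character.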
Unheilig
Jose
0

You need to send string values as bytes. With a socket object s, you can do: s.send(bytes(message, 'utf-8')) or s.send(message.encode())

If you want to send plain ASCII text, you can send a bytes literal directly: b'Spain'. If your text contains non-ASCII characters, you must encode it: 'España'.encode()

Look at this client example and note the .encode() and .decode() calls, which use UTF-8 by default:
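
As a small illustration of what .encode() produces for non-ASCII text, the snippet below compares the default UTF-8 encoding with latin-1 for the same string. The variable names are chosen only for this example.

```python
text = "España"

utf8_bytes = text.encode()               # UTF-8 is the default encoding
latin1_bytes = text.encode("latin-1")    # one byte per code point below 256

print(utf8_bytes)    # b'Espa\xc3\xb1a'
print(latin1_bytes)  # b'Espa\xf1a'
print(len(text), len(utf8_bytes), len(latin1_bytes))  # 6 7 6

# Decoding with the matching codec restores the original string
print(utf8_bytes.decode() == text)               # True
print(latin1_bytes.decode("latin-1") == text)    # True
```

The receiver has to decode with the same codec the sender used, otherwise the 'ñ' will not come back correctly.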

#!/usr/bin/python3
import socket

s = socket.socket()
s.connect(("localhost", 9999))

while True:
    msg = input("> ")
    s.send(msg.encode())
    if msg == "quit":
        break
    received=s.recv(1024)
    print(received.decode())

print("Bye")

s.close()
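
One caveat with the client above: TCP is a byte stream, so a single recv(1024) may return only part of a message or several messages at once. A common workaround is to prefix each message with its length. This is a hedged sketch of that idea; the function names `send_msg`, `recv_exact` and `recv_msg` and the 4-byte big-endian header are assumptions for illustration, not something the answer prescribes.

```python
import socket
import struct


def send_msg(sock: socket.socket, payload: bytes) -> None:
    """Send payload preceded by its length as a 4-byte big-endian integer."""
    sock.sendall(struct.pack("!I", len(payload)) + payload)


def recv_exact(sock: socket.socket, n: int) -> bytes:
    """Read exactly n bytes, looping because recv may return fewer."""
    chunks = []
    remaining = n
    while remaining:
        chunk = sock.recv(remaining)
        if not chunk:
            raise ConnectionError("socket closed before message was complete")
        chunks.append(chunk)
        remaining -= len(chunk)
    return b"".join(chunks)


def recv_msg(sock: socket.socket) -> bytes:
    """Receive one length-prefixed message sent by send_msg."""
    (length,) = struct.unpack("!I", recv_exact(sock, 4))
    return recv_exact(sock, length)
```

With these helpers, the client would call `send_msg(s, msg.encode())` and the other side would call `recv_msg(conn).decode()`, so each message arrives whole regardless of how TCP splits it.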
robertlayton
Rutrus