I am trying to decode UDP packet data from an application that encodes its data with Qt's QDataStream methods, but I am having trouble decoding the string fields. The docs say the strings are encoded as utf8. The Python QDataStream binding only seems to have a readQString() method for strings. Numbers decode fine, but the stream position gets out of sync as soon as the first string decodes improperly.

How can I decode these UTF-8 strings?

I am using documentation from the source project to interpret the encoding: the description in the header file NetworkMessage.hpp from wsjtx-2.2.2.tgz.

Header:
   32-bit unsigned integer magic number 0xadbccbda
   32-bit unsigned integer schema number

For example, the Heartbeat message is described with comments like this:

Heartbeat     Out/In    0                       quint32
                             Id (unique key)        utf8
                             Maximum schema number  quint32
                             version                utf8
                             revision               utf8

Example data from the socket when a heartbeat message is received:

b'\xad\xbc\xcb\xda\x00\x00\x00\x02\x00\x00\x00\x00\x00\x00\x00\x06WSJT-X\x00\x00\x00\x03\x00\x00\x00\x052.1.0\x00\x00\x00\x0624fcd1'

def jt_decode_heart_beat(i):
    """
    Heartbeat     Out/In    0                      quint32
                             Id (unique key)        utf8
                             Maximum schema number  quint32
                             version                utf8
                             revision               utf8
    :param i: QDataStream
    :return: JT_HB_ID,JT_HB_SCHEMA,JT_HB_VERSION,JT_HB_REVISION
    """
    JT_HB_ID = i.readQString()
    JT_HB_SCHEMA = i.readInt32()
    JT_HB_VERSION = i.readQString()
    JT_HB_REVISION = i.readQString()
    print(f"HB:ID={JT_HB_ID} JT_HB_SCHEMA={JT_HB_SCHEMA} JT_HB_VERSION={JT_HB_VERSION} JT_HB_REVISION={JT_HB_REVISION}")
    return (JT_HB_ID, JT_HB_SCHEMA, JT_HB_VERSION, JT_HB_REVISION)

while True:
    data, addr = s.recvfrom(1024)
    b = QByteArray(data)
    i = QDataStream(b)
    # The header fields and the message type are quint32, so read them unsigned
    JT_QT_MAGIC_NUMBER  = i.readUInt32()
    JT_QT_SCHEMA_NUMBER = i.readUInt32()
    JT_TYPE = i.readUInt32()

    if JT_TYPE == 0:
        # Heart Beat
        jt_decode_heart_beat(i)
    elif JT_TYPE == 1:
        jt_decode_status(i)
  • Reviewed the docs a bit more and found this comment: "Type utf8 is a utf-8 byte string formatted as a QByteArray for serialization purposes (currently a quint32 size followed by size bytes, no terminator is present or counted)." I will try this out tomorrow, 0039.. – baler1992 Aug 07 '20 at 07:38

1 Answer

Long story short, the WSJT-X UDP protocol I was reading does not encode its strings as QString, so it was wrong to expect that i.readQString() would work.

Instead, each string is written the way a QByteArray is serialized: a quint32 giving the length in bytes, followed by that many UTF-8 bytes with no terminator.
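To make the layout concrete, here is a minimal sketch that walks the first few fields of the sample packet from the question by hand. It uses only Python's struct module rather than the Qt bindings, and it assumes the stream is big-endian, which is QDataStream's default byte order. The variable names are just for illustration.

```python
import struct

# Sample heartbeat packet from the question
packet = (b'\xad\xbc\xcb\xda\x00\x00\x00\x02\x00\x00\x00\x00'
          b'\x00\x00\x00\x06WSJT-X\x00\x00\x00\x03'
          b'\x00\x00\x00\x052.1.0\x00\x00\x00\x0624fcd1')

# Three big-endian quint32 values: magic number, schema number, message type
magic, schema, msg_type = struct.unpack_from(">III", packet, 0)

# The Id field starts at offset 12: a quint32 size, then that many bytes
(size,) = struct.unpack_from(">I", packet, 12)
ident = packet[16:16 + size].decode("utf-8")

print(hex(magic), schema, msg_type, size, ident)
# prints: 0xadbccbda 2 0 6 WSJT-X
```

readQString() fails on the same bytes because a serialized QString is expected to hold UTF-16 data, so it misreads the field and leaves the stream at the wrong position.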

I successfully encapsulated this functionality in a function:

def jt_decode_utf8_str(i):
    """
    strings are encoded with an int 32 indicating size
    and then an array of bytes in utf-8 of length size
    :param i:
    :return: decoded string
    """
    sz = i.readInt32()
    b = i.readRawData(sz)
    return b.decode("utf-8")
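For anyone who wants to check the format without the Qt bindings, below is a rough pure-Python sketch of a heartbeat decoder built on the same idea. PacketReader and decode_heartbeat are hypothetical names of my own, not part of WSJT-X or Qt. It assumes big-endian byte order (QDataStream's default) and treats a size of 0xFFFFFFFF as a null string, which is how Qt serializes a null QByteArray.

```python
import struct

MAGIC = 0xADBCCBDA  # magic number from the header description


class PacketReader:
    """Hypothetical helper that walks a bytes object field by field."""

    def __init__(self, buf):
        self.buf = buf
        self.pos = 0

    def read_u32(self):
        # quint32: 4 bytes, big-endian, unsigned
        (value,) = struct.unpack_from(">I", self.buf, self.pos)
        self.pos += 4
        return value

    def read_utf8(self):
        # utf8 field: quint32 size followed by size bytes, no terminator
        size = self.read_u32()
        if size == 0xFFFFFFFF:  # size Qt writes for a null QByteArray
            return None
        raw = self.buf[self.pos:self.pos + size]
        if len(raw) != size:
            raise ValueError("truncated utf8 field")
        self.pos += size
        return raw.decode("utf-8")


def decode_heartbeat(packet):
    """Return (id, max_schema, version, revision) for a type 0 message."""
    r = PacketReader(packet)
    if r.read_u32() != MAGIC:
        raise ValueError("bad magic number")
    r.read_u32()  # schema number, not needed here
    if r.read_u32() != 0:
        raise ValueError("not a heartbeat message")
    ident = r.read_utf8()
    max_schema = r.read_u32()
    version = r.read_utf8()
    revision = r.read_utf8()
    return ident, max_schema, version, revision


sample = (b'\xad\xbc\xcb\xda\x00\x00\x00\x02\x00\x00\x00\x00'
          b'\x00\x00\x00\x06WSJT-X\x00\x00\x00\x03'
          b'\x00\x00\x00\x052.1.0\x00\x00\x00\x0624fcd1')
print(decode_heartbeat(sample))
# prints: ('WSJT-X', 3, '2.1.0', '24fcd1')
```

Reading the size as unsigned and checking for 0xFFFFFFFF avoids passing a negative length to the byte read, which can happen with readInt32() if a sender ever writes a null string.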