12

I use TCP to send data to a Python server. The data looks like this:

struct protocol
{
    unsigned char prot;
    int id;
    char name[32];
};

The name field is a null-terminated string with a maximum size of 32 bytes. I fill it with strcpy:

protocol p;
memset(&p, 0, sizeof(p));
strcpy(p.name, "abc");

Then I unpack it in Python:

prot, id, name = struct.unpack("@Bi32s", data)  # data: the bytes received from the socket

Now len(name) is 32, but I need just the string "abc", whose length is 3.

How can I do that?
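For anyone who wants to reproduce this without the C client, here is a minimal Python 3 sketch. It assumes the sender's struct layout matches the native alignment that "@" selects; the values 1 and 42 are made up for illustration and are not part of the original program.

```python
import struct

FMT = "@Bi32s"  # unsigned char, int, char[32], native alignment

# Build a packet roughly the way the C side would after memset + strcpy.
# struct.pack pads the 32s field with NUL bytes.
packet = struct.pack(FMT, 1, 42, b"abc")

prot, id_, name = struct.unpack(FMT, packet)
print(struct.calcsize(FMT))  # total size, including alignment padding
print(len(name))             # 32: the whole fixed-size field comes back
print(name)                  # b'abc' followed by 29 NUL bytes
```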

martineau
tjhack

4 Answers

10

After the unpacking you can just do a:

name = name.split('\0', 1)[0]

Alternatively you could use the ctypes module:

name = ctypes.create_string_buffer(name).value
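
On Python 3 the unpacked field is a bytes object, so the separator has to be bytes as well, and a decode step is needed to get a str. A minimal sketch; the helper name c_string and the ASCII default are assumptions for illustration, not something from the question.

```python
def c_string(raw, encoding="ascii"):
    """Return the text before the first NUL in a fixed-size bytes field."""
    return raw.split(b"\0", 1)[0].decode(encoding)


field = b"abc" + b"\0" * 29   # what the 32s field looks like after unpacking
print(c_string(field))        # abc
```

If the field contains no NUL at all, split returns the whole field, so a name that fills all 32 bytes still comes through intact.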
martineau
  • I would do: `name = name.split('\0', 1)[0]` – falsetru Sep 26 '14 at 13:16
  • +1 I think this is probably the most readable way to do it. It's probably not as efficient as falsetru's or my answer, but for something 32 characters long it probably doesn't matter – loopbackbee Sep 26 '14 at 13:17
  • @falsetru: Yes, adding a `maxsplit` argument would make it slightly more efficient (and is a good suggestion, thanks). – martineau Sep 26 '14 at 13:20
  • with ctypes, `1000000 loops, best of 3: 1.13 µs per loop`, with split, `1000000 loops, best of 3: 385 ns per loop` split is much faster in this case. – fx-kirin Dec 07 '15 at 11:44
  • @fx-kirin: With a max string size of 32, it still may not matter much in the scheme of things — some think ["premature optimization is the root of all evil"](https://en.wikipedia.org/wiki/Program_optimization#When_to_optimize) — but it's nice to know anyway. – martineau Dec 07 '15 at 21:39
  • 2
    In my case, I had to use a binary string for splitting, i.e. ``mystr = mybuff.split(b'\0', 1)[0]``. Otherwise I get: ``TypeError: a bytes-like object is required, not 'str'``. – djlauk Apr 03 '17 at 09:45
  • 1
    @djlauk: In Python 3, `struct.unpack()` returns `bytes` or `bytearray` objects which are different from strings in Python 2—so that's why that would be necessary. You should have mentioned that in your question and tagged it as "python-3.x" not generic "python". – martineau Apr 03 '17 at 10:18
  • 1
    @martineau: Thanks for the insight. I just recently started migrating from python 2 to 3, so I sometimes stumble over such details. But, just to be clear: I didn't ask the question (and hence didn't tag it). Nevertheless your hint regarding tagging appropriately is surely sound advice to anyone, including me, and I probably wouldn't have thought of it. – djlauk Apr 03 '17 at 11:01
  • @djlauk: Sorry for the confusion—it was around 4 am my time, and your username and the OP's were too similar for my bleary eyes to tell apart. – martineau Apr 03 '17 at 13:26
4

Simply get the substring up to the first \0:

prot, id, name = struct.unpack("@Bi32s", data)  # data: the bytes received from the socket
name = name[:name.index(b"\0")]

Note that this raises ValueError if no \0 appears in the string, so a missing terminator is detected rather than silently accepted.
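
If a missing terminator should be tolerated instead (say, a name that fills all 32 bytes), find can stand in for index, since it returns -1 rather than raising. A rough Python 3 sketch; the helper name until_nul is hypothetical.

```python
def until_nul(raw):
    """Slice up to the first NUL, or return the whole field if there is none."""
    end = raw.find(b"\0")   # find() returns -1 where index() would raise
    return raw if end == -1 else raw[:end]


print(until_nul(b"abc" + b"\0" * 29))  # b'abc'
print(len(until_nul(b"x" * 32)))       # 32
```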

loopbackbee
3

Partition it on the null character ('\0') after unpacking:

>>> prot, id, name = struct.unpack('@Bi32s', b'\0\0\0\0\0\0\0\0abc' + b'\0' * 29)
>>> name, _, _ = name.partition(b'\0')
>>> name
b'abc'

Alternative using ctypes:

>>> from ctypes import *
>>>
>>> class Protocol(Structure):
...     _fields_ = [("prot", c_char),
...                 ("id", c_int),
...                 ('name', c_char * 32)]
...
>>> # sock.recv_into(buf) in real program
... buf = create_string_buffer(b'\0\0\0\0\0\0\0\0abc' + b'\0' * 29)
>>> p = cast(buf, POINTER(Protocol))
>>> p[0].name
b'abc'
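
A variation on the ctypes approach, sketched for Python 3: Structure.from_buffer_copy builds the structure straight from received bytes, without the cast. The packet here is generated with struct.pack as a stand-in for socket data, and the values 1 and 42 are illustrative. It assumes the ctypes layout and the "@Bi32s" layout agree, which should hold when both use native alignment.

```python
import ctypes
import struct


class Protocol(ctypes.Structure):
    _fields_ = [("prot", ctypes.c_ubyte),      # unsigned char
                ("id", ctypes.c_int),
                ("name", ctypes.c_char * 32)]


# Stand-in for the bytes read from the socket.
data = struct.pack("@Bi32s", 1, 42, b"abc")

p = Protocol.from_buffer_copy(data)  # copies, so data may be immutable bytes
print(p.prot, p.id, p.name)          # name is already cut at the first NUL
```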
falsetru
1

How about using rstrip to remove the padding?

prot, id, name = struct.unpack("@Bi32s", data)  # data: the bytes received from the socket
name = name.rstrip(b'\0')

This will also let you use all 32 bytes to store the name without a zero terminator. However, this does rely on ALL padding bytes being set to zero.
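
To make that caveat concrete, here is a small Python 3 sketch with a field that was not zeroed before a shorter name was written into it. The stale "xyz" bytes are invented for the example.

```python
# 32-byte field: "abc", its terminator, then leftovers from an earlier value.
field = b"abc\0xyz" + b"\0" * 25

print(field.rstrip(b"\0"))        # b'abc\x00xyz' (stale bytes survive)
print(field.split(b"\0", 1)[0])   # b'abc'
```

With the memset from the question the padding is always zero, so both approaches give the same result there.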

alex.forencich