Consider the following code:

import struct

x = b'example' # can be any bytes object
y = struct.pack(f'{len(x)}s', x)
print(x == y)

If I understand the documentation correctly, the call returns the binary representation of a struct whose only member is an array of `len(x)` chars holding the contents of `x`. Is there any circumstance under which the result will not just be `x` itself?

Or, to rephrase the question from the C perspective (since I also tagged it C): does the C standard allow more than one binary representation of `struct MyStruct { char s[MY_SIZE]; };`?

I am asking this because Mozilla is instructing me to do just that (Ctrl + F for “Python 3”):

# Encode a message for transmission, given its content.
def encode_message(message_content):
    encoded_content = json.dumps(message_content).encode("utf-8")
    encoded_length = struct.pack('=I', len(encoded_content))
    #  use struct.pack("10s", bytes), to pack a string of the length of 10 characters
    return {'length': encoded_length, 'content': struct.pack(str(len(encoded_content))+"s",encoded_content)}
Kelly Bundy
Ground
  • I'm not sure about Python. In C, `unsigned char [N]` has exactly one binary representation, but `char[N]` *might* have padding bits within each `char`, and `struct { anything; }` can have padding at the end. The extent to which Python's `struct` library implements these edge cases is one of the things I'm not sure of. – zwol Jul 24 '22 at 23:16
  • @zw I tried all possible `bytes` up to length 3, then 1000 random ones of length 1 million, then 1000 random ones for each length up to 1000. In all cases, the struct packing had no effect. – Kelly Bundy Jul 24 '22 at 23:21
  • In C, structure padding does not need to come only at the end, it can come between any members with unlike alignment requirements. – SoronelHaetir Jul 24 '22 at 23:24
  • @zwol If the `struct` library didn’t do exactly what the system’s C compiler is doing, that would be kind of self-defeating. I think you’re wrong about padding bits inside `char` ([check this](https://en.cppreference.com/w/c/language/object#Object_representation)). Also, since `alignof(char)` is guaranteed to be 1 ([source](https://en.cppreference.com/w/c/language/object#Alignment)), I don’t see how we could end up with padding at the end of the struct. – Ground Jul 24 '22 at 23:49
  • @SoronelHaetir Yes, but there cannot be padding at the _beginning_ of a `struct` (so a struct with one member can have padding _only_ at the end) and there cannot be padding in between the elements of an array. – zwol Jul 24 '22 at 23:51
  • @Ground (1) cppreference.com is not authoritative; assuming `char` is signed, I don't see anything in [section 6.2.6.1 or 6.2.6.2 of the standard](http://port70.net/~nsz/c/c11/n1570.html#6.2.6) that says it can't have padding bits. I will acknowledge that this is extremely unlikely, however. (2) `struct { char x; }` does not have to have the same alignment requirement as `char x`; there have been ABIs that said the minimum alignment for _all_ `struct` types was 4, for instance. – zwol Jul 24 '22 at 23:55

1 Answer

It's entirely pointless. Whoever wrote that example probably didn't think it through or didn't quite understand what `struct.pack` does.

In particular, even if you had an utterly bizarre platform that actually padded `struct MyStruct { char s[MY_SIZE]; };`, the Python `struct` module still wouldn't pad that `struct.pack` call. `struct.pack` doesn't add trailing padding (or leading padding, if you had some really weird platform that would do that). Quoting the docs:

Padding is only automatically added between successive structure members. No padding is added at the beginning or the end of the encoded struct.
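
A minimal sketch of how one might check this locally. The variable names (`int_size`, `size_ci`, `size_ic`) are illustrative, and the printed sizes depend on the platform, so no specific output is claimed. It compares a format that needs padding between members with one that would only need it at the end, then confirms that a lone `Ns` member packs to its input unchanged. The result differs from the input only when the count does not match `len(x)`.

```python
import struct

# Native mode ('@', the default) pads only *between* members.
# 'ci': char then int, so pad bytes may be inserted before the int to align it.
# 'ic': int then char; nothing follows the char, so no trailing padding is added.
int_size = struct.calcsize('i')
size_ci = struct.calcsize('ci')
size_ic = struct.calcsize('ic')
print(int_size, size_ci, size_ic)  # values are platform-dependent

# A format with a single 'Ns' member has no "between", so nothing is padded.
for x in (b'', b'a', b'example', bytes(range(256))):
    fmt = f'{len(x)}s'
    assert struct.calcsize(fmt) == len(x)
    assert struct.pack(fmt, x) == x

# The result differs from the input only when the count is not len(x):
too_short = struct.pack('10s', b'abc')  # padded with null bytes up to 10
too_long = struct.pack('2s', b'abc')    # truncated to 2 bytes
print(too_short, too_long)
```

Under that reading, the `content` value in the quoted `encode_message` could simply be `encoded_content`, since the format's count is always `len(encoded_content)`.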

user2357112