Python 3.1.3 ctypes.structure does not order bits properly and unexpectedly modifies data

Question

I defined the following structure:

import ctypes
from ctypes import *
class r( BigEndianStructure ):
    _fields_ = [( "d0", c_uint32, 28 ),
                ( "d1", c_uint32, 18 ),
                ( "d2", c_uint32, 18 ),
                ( "d3", c_uint16, 16 ),
                ( "d4", c_uint16, 16 ), ]

Then I tested it with the following code:

a = r(0xAAAAAAAA,0xBBBBBBBB,0xCCCCCCCC,0xDDDD,0xEEEE)
for byte in string_at( addressof( a ), sizeof( a ) ):
    print(hex(byte),end=" ")

The result is:

0xaa 0xaa 0xaa 0xa0 0xee 0xee 0xc0 0x0 0x33 0x33 0x0 0x0 0xdd 0xdd 0xee 0xee

The expected result was:

0xaa 0xaa 0xaa 0xa0 0xbb 0xbb 0xc0 0x0 0xcc 0xcc 0xc0 0x0 0xdd 0xdd 0xee 0xee

Not only was the structure not packed compactly, the stored data also differs from what I entered. Did I make a mistake, or does Python modify data on its own?

Jeremy Brown · Answer 1 · 2011-02-03T21:10:21.793

The fields are left-aligned within their container (padded on the right).

Field d0:

(0xAAAAAAAA & 0xFFFFFFF (28 bits)) << (32 - 28 = 4) = 0xAAAAAAA0

Field d1:

(0xBBBBBBBB & 0x3FFFF (18 bits)) << (32 - 18 = 14) = 0xEEEEC000

Field d2:

(0xCCCCCCCC & 0x3FFFF (18 bits)) << (32 - 18 = 14) = 0x33330000

Fields d1 and d2 together need 36 bits and will not both fit within one 32-bit container, so d2 is aligned to the next 32-bit slot.

Illustrative step-by-step example for field d1:

0xBBBBBBBB & 0x3FFFF (only the least-significant 18 bits are kept) = 0x3BBBB

0x3BBBB << 14 (pad the last 14 bits with zeros) = 0xEEEEC000
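
The arithmetic above can be reproduced without ctypes. This is only a sketch of the mask-then-shift steps described in this answer; `left_align` is a hypothetical helper written for illustration, not a ctypes function, and it assumes a 32-bit container as in the question.

```python
# Sketch of the "keep the low bits, then left-align" arithmetic.
# left_align is a made-up helper, not part of ctypes.
def left_align(value, width, container_bits=32):
    masked = value & ((1 << width) - 1)          # keep only the low `width` bits
    return masked << (container_bits - width)    # pad on the right with zeros

for name, value, width in [("d0", 0xAAAAAAAA, 28),
                           ("d1", 0xBBBBBBBB, 18),
                           ("d2", 0xCCCCCCCC, 18)]:
    print(name, "{0:08X}".format(left_align(value, width)))
```

The three printed values should match the first three 32-bit words in the question's actual output (AAAAAAA0, EEEEC000, 33330000).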

edited Feb 03 '11 at 21:10

answered Feb 03 '11 at 21:04

Jeremy Brown

Don't you think that left-aligning bits (putting bits to the left) and taking the least significant bits (the bits on the right) are two contradictory ways to process input data? – SUCM Feb 03 '11 at 22:22
I don't understand your point. Bit fields are mapped in order. If you had 2 consecutive bit fields defined that could both fit in the container type (c_uint32), then you would expect field2 to follow field1 with no padding within that container. Instead, since the next field does not fit in that boundary it is aligned to the next one - thus the padding to the right. Since the struct is ordered as big endian, it's even more apparent that the rightmost bits are padded. – Jeremy Brown Feb 03 '11 at 22:51
As for taking the least significant bits - that seems like a perfectly fine sanity check to me since you are assigning a value that is larger than the bit mask representing that bit field. In C, wouldn't you expect a narrowing-cast of a 64 bit unsigned int to a 32 bit unsigned int to only use the least significant bits? – Jeremy Brown Feb 03 '11 at 22:52
My point is simple. If you can compact bits, compact them. If you can't, don't even try. Look at my structure. If Python can't compact the bits without extra zero padding, why bother packing them into nonsense, so that I have to deal with both the padding and Python's failure to compact? – SUCM Feb 04 '11 at 00:15
@SUCM I don't think you understand how bitfields work in C. Your gripes are with the C standard and the various architecture/platform-specific implementations - not Python or even ctypes. If you want control over how data is packed, then you need to do the bitwise operations yourself. – Jeremy Brown Feb 04 '11 at 01:24
Can you explain why data is lost in the following code if Python complies with the C bitfield implementation? I'm trying to use the structure properly, but it does not work the same way as in C:

import ctypes
from ctypes import *
class r( BigEndianStructure ):
    _fields_ = [( "d0", c_uint16, 16 ),
                ( "d1", c_uint16, 16 ),
                ( "d2", c_uint64, 48 ),
                ( "d3", c_uint16, 16 ), ]

b = r(0xAAAA,0xBBBB,0xCCCCCCCCDDDD,0xEEEE)
for byte in string_at( addressof( b ), sizeof( b ) ):
    print(hex(byte),"",end="")

– SUCM Feb 04 '11 at 02:44
For good measure, here is some reading material on bitfields - http://stackoverflow.com/questions/1490092/c-c-force-bit-field-order-and-alignment – Jeremy Brown Feb 04 '11 at 15:01
I guess that until the bug is fixed, nothing will work as expected with Structure. – SUCM Feb 04 '11 at 18:17
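
The suggestion in the comments above to do the bitwise operations yourself could look roughly like the sketch below. It packs the question's five fields MSB-first with no padding between them. `pack_fields` is a hypothetical helper, not a ctypes API, and it assumes Python 3.2 or later for `int.to_bytes` (newer than the 3.1.3 mentioned in the title). Like ctypes, it keeps only the low bits of each value, so the values passed in are already trimmed to their field widths.

```python
# Manual bit packing, independent of ctypes.
# pack_fields is a made-up helper for illustration only.
def pack_fields(fields):
    """Pack (value, width) pairs MSB-first with no padding between fields."""
    packed = 0
    total_bits = 0
    for value, width in fields:
        # Shift earlier fields left and append the low `width` bits of value.
        packed = (packed << width) | (value & ((1 << width) - 1))
        total_bits += width
    pad = -total_bits % 8   # zero bits needed to end on a byte boundary
    return (packed << pad).to_bytes((total_bits + pad) // 8, "big")

data = pack_fields([(0xAAAAAAA, 28), (0x3BBBB, 18), (0x0CCCC, 18),
                    (0xDDDD, 16), (0xEEEE, 16)])
print(" ".join(hex(b) for b in data))
```

The five widths add up to 96 bits, so the result is 12 bytes instead of the 16 bytes that the ctypes structure in the question occupies.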

score 0 · Answer 2 · answered Feb 03 '11 at 21:10

It looks like the issue comes from storing the 18 bit width values in 32 bits, and then interpreting them as full 32 bit values.

Let's look at what is happening with 0xBBBBBBBB:

0xBBBBBBBB = 10111011101110111011101110111011b
0xBBBBBBBB & 0x3FFFF (bit width of 18) = 111011101110111011b

Left-aligned in the 32-bit container (shifted left by 14 bits):

11101110111011101100000000000000b = 0xEEEEC000

Basically, when you read the memory like this, you do not see the 18-bit masked value in the low bits as you expected; you see that value shifted left by 14 bits.

From your explanation, Python allocates a non-compact structure, then tries to compact the bits in it, and fails at both. This behaviour contradicts itself. I wonder why Python tries to do everything and fails at everything. – SUCM Feb 03 '11 at 22:17

Mark Tolonen · Answer 3 · 2011-02-04T07:12:19.583

Use bit fields that fit in a container type to avoid alignment padding. In the example below, 4+8+16 fit in c_uint32, but 4+8+16+5 does not, so d3 aligns in the next c_uint32:

from ctypes import *
class r( BigEndianStructure ):
    _fields_ = [('d0',c_uint32, 4),
                ('d1',c_uint32, 8),
                ('d2',c_uint32,16),
                ('d3',c_uint32, 5)]

def fld(n):
    return '[' + '-'*(n-2) + ']'

def pad(n):
    return '.'*n

print(fld(4),fld(8),fld(16),pad(4),fld(5),pad(27),sep='')

for i in range(1,17):
    v = 2**i-1
    a = r(v,v,v,v)
    for byte in string_at( addressof( a ), sizeof( a ) ):
        print('{0:08b}'.format(byte),end='',sep='')
    print()

Output

Binary output makes the numbers easier to visualize. Note that the 5-bit field couldn't fit in the remaining 4 bits of the first c_uint32, so 4 bits of padding were added to start the 5-bit field in the next c_uint32.

[--][------][--------------]....[---]...........................
0001000000010000000000000001000000001000000000000000000000000000
0011000000110000000000000011000000011000000000000000000000000000
0111000001110000000000000111000000111000000000000000000000000000
1111000011110000000000001111000001111000000000000000000000000000
1111000111110000000000011111000011111000000000000000000000000000
1111001111110000000000111111000011111000000000000000000000000000
1111011111110000000001111111000011111000000000000000000000000000
1111111111110000000011111111000011111000000000000000000000000000
1111111111110000000111111111000011111000000000000000000000000000
1111111111110000001111111111000011111000000000000000000000000000
1111111111110000011111111111000011111000000000000000000000000000
1111111111110000111111111111000011111000000000000000000000000000
1111111111110001111111111111000011111000000000000000000000000000
1111111111110011111111111111000011111000000000000000000000000000
1111111111110111111111111111000011111000000000000000000000000000
1111111111111111111111111111000011111000000000000000000000000000
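
As a rough check of the container rule shown above, the sketch below compares `sizeof` for a layout whose bit fields add up to exactly 32 bits with the 33-bit layout from this answer. The class names `Fits` and `Spills` are made up for illustration, and bit-field layout can depend on the platform and Python version, so treat the sizes as what is expected rather than guaranteed.

```python
from ctypes import BigEndianStructure, c_uint32, sizeof

class Fits(BigEndianStructure):
    # 4 + 8 + 16 + 4 = 32 bits: everything shares one c_uint32 container.
    _fields_ = [("d0", c_uint32, 4),
                ("d1", c_uint32, 8),
                ("d2", c_uint32, 16),
                ("d3", c_uint32, 4)]

class Spills(BigEndianStructure):
    # 4 + 8 + 16 + 5 = 33 bits: d3 starts a second c_uint32 container.
    _fields_ = [("d0", c_uint32, 4),
                ("d1", c_uint32, 8),
                ("d2", c_uint32, 16),
                ("d3", c_uint32, 5)]

print(sizeof(Fits), sizeof(Spills))
```

If the fields pack as described in this answer, the first structure occupies 4 bytes and the second 8 bytes, which matches the 64 bits per row in the output above.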
