Asm like that is how C structs work; casting to a struct foo*
with the layout you want is one option.
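As a rough sketch of the struct approach: the struct foo layout and the make_foo helper below are made up for illustration, not taken from the question. Each member access compiles to a base-plus-displacement address, much like hand-written asm.

```c
#include <assert.h>
#include <stddef.h>
#include <stdint.h>
#include <stdlib.h>

/* Hypothetical layout, chosen only for illustration. */
struct foo {
    uint32_t id;      /* offset 0 */
    uint16_t flags;   /* offset 4 on typical ABIs */
    uint16_t len;     /* offset 6 on typical ABIs */
};

/* Assumption: no padding between members, which holds on mainstream ABIs. */
static_assert(offsetof(struct foo, flags) == 4, "unexpected padding");
static_assert(offsetof(struct foo, len) == 6, "unexpected padding");

/* Allocates a struct foo, fills it in, and returns it (NULL on failure). */
static struct foo *make_foo(uint32_t id, uint16_t flags, uint16_t len)
{
    struct foo *p = malloc(sizeof *p);  /* malloc'd memory is aligned for any type */
    if (p == NULL)
        return NULL;
    p->id = id;        /* 32-bit store to (%rax), or similar, on x86-64 */
    p->flags = flags;  /* 16-bit store to 4(%rax), or similar */
    p->len = len;      /* 16-bit store to 6(%rax), or similar */
    return p;
}
```

Because every access goes through the struct's own member types, there is no aliasing or alignment problem here, as long as the layout really is what you want.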
Beware of the strict-aliasing and alignment rules in C if you want to do overlapping accesses with different types. C is not portable assembly language; you have to obey C's rules if you don't want a compiler to "break" your code at compile time. (Or from another PoV, for your code to mean what you want and not be already broken.) For example, Why does unaligned access to mmap'ed memory sometimes segfault on AMD64? shows that alignment-related segfaults are possible after compiler optimization even on an ISA like x86-64, where normal scalar unaligned loads/stores never fault.
Use memcpy(ptr+3, &u32value, sizeof(u32value));
for an unaligned 4-byte store that you can safely reload with a different type. (This assumes a variable like uint32_t u32value = 1234;
uint32_t is required to be exactly 32 bits wide with no padding, if the system provides it.)
With a decent modern compiler targeting an ISA (like x86-64) where unaligned loads are cheap, that memcpy
will fairly reliably inline and optimize to a single mov
store with the right operand size.
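A minimal sketch of that memcpy idiom, wrapped in helpers. The names store_u32_unaligned, load_u32_unaligned and unaligned_roundtrip_demo are hypothetical, not from any library, and the sketch assumes 8-bit char so that a uint32_t occupies 4 bytes.

```c
#include <assert.h>
#include <limits.h>
#include <stdint.h>
#include <stdlib.h>
#include <string.h>

/* Assumption for this sketch: 8-bit bytes, so uint32_t occupies 4 chars. */
static_assert(CHAR_BIT == 8, "sketch assumes 8-bit char");

/* Hypothetical helper names; not part of any library. */
static inline void store_u32_unaligned(void *dst, uint32_t value)
{
    memcpy(dst, &value, sizeof(value));  /* typically inlines to one mov on x86-64 */
}

static inline uint32_t load_u32_unaligned(const void *src)
{
    uint32_t value;
    memcpy(&value, src, sizeof(value));
    return value;
}

/* Does two overlapping unaligned stores and reloads the second one.
   Returns 1 if the reload matches, 0 otherwise (or on allocation failure). */
static int unaligned_roundtrip_demo(void)
{
    unsigned char *ptr = malloc(16);
    if (ptr == NULL)
        return 0;
    store_u32_unaligned(ptr + 3, 1234);  /* writes bytes 3..6 */
    store_u32_unaligned(ptr + 5, 5678);  /* writes bytes 5..8, overlapping 5..6 */
    int ok = (load_u32_unaligned(ptr + 5) == 5678);
    free(ptr);
    return ok;
}
```

Since memcpy accesses the bytes as if through character types, neither the misaligned address nor the overlap between the two stores is a problem for the C abstract machine.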
*(uint32_t*)(ptr+3) = 1234
would not be safe, because alignof(uint32_t)
is 4 on most C implementations (ones with 8-bit char). Since malloc itself returns memory sufficiently aligned to store any type (up to max_align_t), ptr+3 is definitely misaligned (except on unusual systems with a wide char where sizeof(uint32_t) == 1, or where the implementation chooses not to require alignment of wider types).
Or in GNU C we can use type attributes to tell the compiler about under-aligned versions of types, and/or give them the same alias-anything ability as char*.
(__m128i
and other Intel vector intrinsic types use the may_alias
attribute; that's how compilers make it safe to _mm_load_ps( (float*)ptr_to_int )
and stuff like that.)
typedef uint32_t unaligned_aliasing_u32 __attribute__((aligned(1),may_alias));
char *ptr = malloc(1234);
*(unaligned_aliasing_u32*)(ptr+3) = 5678; // movl $5678, 3(%rax)
*(unaligned_aliasing_u32*)(ptr+5) = 5678; // movl $5678, 5(%rax) overlapping
int tmp = ((const uint32_t*)ptr)[1]; // mov 4(%rax), %edx   aligned, so I used plain uint32_t for this example.
See part of my answer on Why does glibc's strlen need to be so complicated to run quickly? for another example of using a may_alias load to safely read a buffer as longs. glibc's own version is only "safe" because it's compiled with known compilers and without the possibility of inlining into a caller that loads or stores the same memory with a different type. (That one does need to be aligned, so I omitted the aligned(1) attribute.)
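To illustrate an aligned may_alias load in isolation, here is a small sketch. It is not the strlen code from that answer; aliasing_ulong and is_all_zero are hypothetical names, and the attribute is a GNU C extension (GCC and clang), not ISO C.

```c
#include <assert.h>
#include <stddef.h>
#include <stdint.h>
#include <stdlib.h>

/* GNU C extension: this type may alias any other type, like char does. */
typedef unsigned long aliasing_ulong __attribute__((may_alias));

/* Hypothetical helper: returns 1 if the first nbytes of buf are all zero.
   Precondition: buf is aligned for unsigned long (e.g. it came from malloc). */
static int is_all_zero(const char *buf, size_t nbytes)
{
    assert((uintptr_t)buf % _Alignof(unsigned long) == 0);

    const aliasing_ulong *words = (const aliasing_ulong *)buf;
    size_t nwords = nbytes / sizeof(unsigned long);
    for (size_t i = 0; i < nwords; i++) {
        if (words[i] != 0)   /* word-sized load that is allowed to alias char data */
            return 0;
    }
    for (size_t i = nwords * sizeof(unsigned long); i < nbytes; i++) {
        if (buf[i] != 0)     /* leftover tail bytes, read one at a time */
            return 0;
    }
    return 1;
}
```

No aligned(1) is used here, so the compiler may still assume natural alignment for the word loads; that is why the alignment precondition is checked up front.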