2

I'm trying to translate this snippet:

ntohs(*(UInt16*)VALUE) / 4.0

and some other similar ones from C to Swift. The problem is, I have very little knowledge of Swift and I just can't understand what this snippet does... Here's all I know:

  • ntohs swaps the value from network byte order to host byte order
  • VALUE is a char[32]
  • I just discovered that this Swift expression: (UInt(data.0) << 6) + (UInt(data.1) >> 2) does the same thing (see the sketch after this list). Could someone please explain?
  • I want to return a Swift UInt (UInt64)
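
Here's a minimal sketch of what I mean, with made-up byte values (the real data comes off the wire, so these are just for illustration):

let data: (UInt8, UInt8) = (0x12, 0x37)                  // example bytes in network (big-endian) order

// What the C does, as far as I can tell: ntohs gives 0x1237 = 4663, and 4663 / 4.0 = 1165.75.

// The Swift expression I found gives the same value, just without the fraction:
let result = (UInt(data.0) << 6) + (UInt(data.1) >> 2)   // 1165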

Thanks!

Perceval
  • Casting a `char` pointer (or array) to `UInt16*` is rather unsafe, and it can lead to undefined behavior. – barak manos Jun 20 '15 at 19:31
  • That's no pointer black magic. It's just a reinterpreting cast. You should see some real pointer magic... – cmaster - reinstate monica Jun 20 '15 at 19:35
  • @cmaster It's undefined behavior, so it's not the good kind of magic ;-) – zwol Jun 20 '15 at 20:20
  • @zwol I agree that it's not written in the best way possible. However, it's only undefined behavior if you can't guarantee alignment of the `char` buffer that's used. If that buffer is allocated with `malloc()`, it's sufficiently aligned and no UB is caused. Things are different if the buffer is on the stack, though: the compiler is not required to align the `char` array in any way, which can indeed lead to UB of the worst kind in theory. In practice, that UB likely won't manifest: there's an excellent chance that the buffer ends up aligned, and most CPUs can handle misaligned reads of 16 bits. – cmaster - reinstate monica Jun 20 '15 at 20:30
  • @cmaster It is undefined behavior per the type-based aliasing rules, regardless of where `VALUE` is allocated or how it is aligned. The only way it would not be UB is if `VALUE` itself had been converted *from* a pointer to (array-of-)`UInt16` earlier, and the data pointed to had not been modified in between the two conversions. But if that were the case there would have been no need to call `ntohs`. – zwol Jun 20 '15 at 20:32
  • @zwol There is an explicit exception in the strict aliasing rules for casts to/from `char` types. Otherwise `memmove()` and friends would be impossible to implement in a standard compliant way. – cmaster - reinstate monica Jun 21 '15 at 05:19
  • @cmaster Yes, but it's **asymmetric**. You can access any object via a pointer-to-`char`, but you can **not** access an array of `char` via a pointer to some other type. – zwol Jun 21 '15 at 14:34
  • @zwol Ah, yes, you are right, that slight asymmetry escaped my notice. Mainly because the definition of a memory object is purely a language construct. Of course, that asymmetry allows the standard to dodge the alignment issue. But it leaves the question of how `calloc()` is supposed to work according to the standard: `calloc()` has to treat the memory object that it returns as a `char` array, which is then reinterpreted via a `void*` to some other memory object. According to the asymmetric strict aliasing rules `if(*(int*)calloc(1, sizeof(int))) ...` seems to be undefined behavior... – cmaster - reinstate monica Jun 21 '15 at 05:19
  • @cmaster It isn't, but you have to read the "effective type" rules (C99 §6.5p6,7) very carefully to understand why it isn't. The object pointed-to by the value returned from `calloc` is an "object having no declared type" which has *not* been "stored into ... through an lvalue having a type that is not a character type", so the "effective type" of your `int` read is in fact `int` and you're OK. – zwol Jun 21 '15 at 16:36
  • @cmaster ... Hmm, actually it is starting to sound like `unsigned char *foo = calloc(1, sizeof(int)); foo[0] = 0xAA; int val = *(int *)foo;` *does* have well-defined behavior -- but if that had been `static unsigned char foo[sizeof(int)]` instead it wouldn't be. Oh my aching head. – zwol Jun 21 '15 at 16:41
  • @cmaster I've hoisted this tangent to its own question: https://stackoverflow.com/questions/30967447/ub-on-reading-object-using-non-character-type-when-last-written-using-character – zwol Jun 21 '15 at 17:49

3 Answers

7
  1. VALUE is a pointer to 32 bytes (char[32]).
  2. The pointer is cast to a UInt16 pointer. That means the first two bytes of VALUE are interpreted as a UInt16 (2 bytes).
  3. * dereferences the pointer. We get the first two bytes of VALUE as a 16-bit number. However, it is in network endianness (network byte order), so we can't do integer arithmetic on it directly.
  4. ntohs then swaps the endianness to host order, and we get a normal integer.
  5. We divide the integer by 4.0.

To do the same in Swift, let's just combine the byte values into an integer.

let host = (UInt(data.0) << 8) | UInt(data.1)

Note that to divide by 4.0 you will have to convert the integer to Float.
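
For example, a quick sketch (using the data tuple from the question; Double would work just as well as Float):

let host = (UInt(data.0) << 8) | UInt(data.1)   // combine the two big-endian bytes
let scaled = Float(host) / 4.0                  // explicit conversion; Swift never converts numeric types implicitly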

Sulthan
  • EDIT: We were writing at the same time. Your solution looks nicer and safer than mine (using UnsafePointer), so I'll gladly take yours. Thanks also for the nice explanations! – Perceval Jun 20 '15 at 19:38
  • Not a pointer to 32 bytes. Pointer to an area where a UInt16 is expected to be found. That is an unsigned int on 16 bits (2 bytes) in network order. Its value is divided by 4.0. – Michel Billaud Jun 20 '15 at 19:46
  • If `data.0` contains the MSB and `data.1` the LSB of a 16-bit integer, then `(UInt(data.0) << 8) | UInt(data.1)` already is the correct number and must *not* be byte-order converted. – Martin R Jun 20 '15 at 20:17
  • @MartinR You are absolutely right. I had to be tired. – Sulthan Jun 20 '15 at 22:20
2

The C you quote is technically incorrect, although it will be compiled as intended by most production C compilers.¹ A better way to achieve the same effect, which should also be easier to translate to Swift, is

unsigned int val = ((((unsigned int)(unsigned char)VALUE[0]) << 8) |  // ² ³
                    (((unsigned int)(unsigned char)VALUE[1]) << 0));  // ⁴

double scaledval = ((double)val) / 4.0;                // ⁵

The first statement reads the first two bytes of VALUE, interprets them as a 16-bit unsigned number in network byte order, and converts them to host byte order (whether or not those byte orders are different). The second statement converts the number to double and scales it.

¹ Specifically, *(UInt16*)VALUE provokes undefined behavior because it violates the type-based aliasing rules, which are asymmetric: a pointer with character type may be used to access an object with any type, but a pointer with any other type may not be used to access an object with (array-of-)character type.

² In C, a cast to unsigned int here is necessary in order to make the subsequent shifting and or-ing happen in an unsigned type. If you cast to uint16_t, which might seem more appropriate, the "usual arithmetic conversions" would then convert it to int, which is signed, before doing the left shift. This would provoke undefined behavior on a system where int was only 16 bits wide (you're not allowed to shift into the sign bit). Swift almost certainly has completely different rules for arithmetic on types with small ranges; you'll probably need to cast to something before the shift, but I cannot tell you what.

³ I have over-parenthesized this expression so that the order of operations will be clear even if you aren't terribly familiar with C.

⁴ Left shifting by zero bits has no effect; it is only included for parallel structure.

⁵ An explicit conversion to double before the division operation is not necessary in C, but it is in Swift, so I have written it that way here.
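
For what it's worth, here is one possible Swift counterpart (only a sketch; I'm assuming the bytes arrive as an array of UInt8 named bytes, standing in for VALUE). Swift has no implicit integer promotions, so each byte is widened explicitly before the shift, and the conversion before the division is spelled out as footnote ⁵ describes:

let bytes: [UInt8] = [0x12, 0x37]                        // stand-in for the first two bytes of VALUE (example values)

// Widen each byte to UInt before shifting so the high byte's bits aren't lost.
let val = (UInt(bytes[0]) << 8) | (UInt(bytes[1]) << 0)

// Explicit conversion before the division; Swift will not do it implicitly.
let scaledval = Double(val) / 4.0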

zwol
-2

It looks like the code is taking the single byte value[0]. This is then dereferenced, which should retrieve a number from a low memory address, 1 to 127 (possibly 255). Whatever number is there is then divided by 4.

I genuinely can't believe my interpretation is correct and can't check it because I have no laptop. I really think there may be a typo in your code, as it is not a good thing to do in terms of portability and reusability.

I must stress that the string is not converted to a number which is then used.

phil
  • A `UInt16` is 2 bytes, not 1 byte. There is no string. – Sulthan Jun 20 '15 at 19:27
  • @Sulthan is right. Moreover... there is definitely *no* typo. It's production code, battle-tested and used by many, many open source projects, and probably some other paid ones too ;-) – Perceval Jun 20 '15 at 19:36