So, like the title says, I want to be able to convert between bytes loaded into memory as char* and uints. I have a program that demos some functions that seem to do this, but I am unsure if it is fully compliant with the C++ standard. Is all the casting I am doing legal and well defined? Am I handling sign extension, masking, and truncation correctly? I plan to eventually deploy this code to a variety of different platforms, sometimes with drastically different architectures, and everything I have tried so far seems to imply that this is valid cross-platform code to serialize and deserialize my data, but I am more interested in what the standard says than in whether or not this works on my particular machines. Here's the small test program to demo the conversion functions:
#include <cstdint>
#include <type_traits>
#include <iostream>
#include <iomanip>

// Convert a single byte to an unsigned integral type, masking off
// anything above the low 8 bits that sign extension might have set.
template<typename IntType>
IntType toUint( char byte ) {
    static_assert( std::is_integral_v<IntType>, "IntType must be an integral" );
    static_assert( std::is_unsigned_v<IntType>, "IntType must be unsigned" );
    return static_cast<IntType>( byte ) & 0xFF;
}

// Print each byte of the array as a decimal and a hex value of IntType.
template<typename IntType>
void printAs( signed char* cString, const int arraySize )
{
    std::cout << "Values: [" << std::endl;
    for( int i = 0; i < arraySize; i++ )
    {
        std::cout << std::dec << std::setfill('0') <<
            std::setw(3) << toUint<IntType>( cString[i] ) <<
            ": " << "0x" << std::uppercase << std::setfill('0') <<
            std::setw(16) << std::hex << toUint<IntType>( cString[i] );
        if( i < (arraySize - 1) )
        {
            std::cout << ", ";
            std::cout << std::endl;
        }
    }
    std::cout << std::endl << "]" << std::endl;
}

// Pack the bytes of the array into a single unsigned integer,
// big-endian (the first byte ends up most significant).
template<typename IntType>
IntType cStringToUint( signed char* cString, const int arraySize )
{
    IntType myValue = 0;
    for( int i = 0; i < arraySize; i++ )
    {
        myValue <<= 8;
        myValue |= toUint<IntType>( cString[i] );
    }
    return myValue;
}

template<typename IntType>
void printAsHex( IntType myValue )
{
    std::cout << "0x" << std::uppercase << std::setfill('0') <<
        std::setw(16) << std::hex << myValue << std::endl;
}

int main()
{
    const int arraySize = 9;
    // assume Big Endian
    signed char cString[arraySize] = {-1,2,4,8,16,-32,64,127,-128};
    // convert each byte to a uint and print the value
    printAs<uint64_t>( cString, arraySize );
    // notice this shifts the leading byte out, since 9 bytes don't fit in a uint64_t
    printAsHex( cStringToUint<uint64_t>( cString, arraySize ) );
}
Which gives the following output with my compiler:
Values: [
255: 0x00000000000000FF,
002: 0x0000000000000002,
004: 0x0000000000000004,
008: 0x0000000000000008,
016: 0x0000000000000010,
224: 0x00000000000000E0,
064: 0x0000000000000040,
127: 0x000000000000007F,
128: 0x0000000000000080
]
0x02040810E0407F80
So, is this well defined and specified? Can I rest assured that I will get this output every time? I've tried to be thorough, but I would appreciate a second opinion at the very least, or preferably a citation from the standard covering how casting from char to an unsigned type and promoting to a wider type is defined, along with the sign extension rules, if this is indeed well defined and specified. I really don't want to have to reach for Boost just to do this in a cross-platform way.
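To make my concern concrete, here is the chain of values I believe happens inside toUint<uint64_t> for the -128 entry (ignoring for a moment that the parameter is plain char rather than signed char), and which I'd like confirmed against the standard:

signed char b = -128;
uint64_t widened = static_cast<uint64_t>( b ); // I expect 0xFFFFFFFFFFFFFF80 (value mod 2^64)
uint64_t masked  = widened & 0xFF;             // I expect 0x0000000000000080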
Also, feel free to assume that I will always be casting to a type of the same or wider width with this. Narrowing casts seem tricky, so I'm just ignoring them for now (I will probably eventually implement some kind of truncation, similar to the static_cast<IntType>( byte ) & 0xFF; in this code, depending on the widths of the input and desired types).
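Purely to illustrate where I'm headed with that, the eventual narrowing helper would probably look something like the sketch below. This is untested and not part of the program above; toNarrowUint is just a placeholder name, and it would also need #include <limits>:

// Rough, untested sketch of the narrowing idea: mask the wide value down
// to the destination width (mirroring the "& 0xFF" in toUint), then cast.
template<typename NarrowType, typename WideType>
NarrowType toNarrowUint( WideType value )
{
    static_assert( std::is_unsigned_v<NarrowType>, "NarrowType must be unsigned" );
    static_assert( std::is_unsigned_v<WideType>, "WideType must be unsigned" );
    // For unsigned targets the cast alone already wraps modulo 2^N, but the
    // explicit mask spells out the truncation I intend.
    return static_cast<NarrowType>(
        value & static_cast<WideType>( std::numeric_limits<NarrowType>::max() ) );
}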