I have a question: I'm trying to use memcpy() to copy a string[9] into an unsigned long long int variable. Here's the code:

unsigned char string[9] = "messagee";
string[8] = '\0';
unsigned long long int aux;

memcpy(&aux, string, 8);
printf("%llx\n", aux); // prints inverted data

/* 
 * expected: 6d65737361676565
 *  printed: 656567617373656d
 */

How do I make this copy without inverting the data?

Ricardo Mendes
  • That's how an `unsigned long long int` stores its data. You can't change that. If you want to store it in a different order, you'll need to rearrange the data before you copy it or shift the bits around after you copy it. – Jonathan Wood Oct 12 '18 at 19:45
  • Your machine is Little Endian. – Fiddling Bits Oct 12 '18 at 19:46
  • [Compose the value byte by byte](https://commandcenter.blogspot.com/2012/04/byte-order-fallacy.html). – Raymond Chen Oct 12 '18 at 19:46
  • Or if it is all about printing, use `%02hhx` to print individual characters from string. – Antti Haapala -- Слава Україні Oct 12 '18 at 19:47
  • Oh I see it now hahaha Anyway, I was going to convert the `string` to a `unsigned long long int` to manipulate the bits. There's going to be no problem with that right? I'm implementing the DES algorithm for a homework – Ricardo Mendes Oct 12 '18 at 20:05
  • Anything longer than 8 bytes must be split up into 8 byte chunks. Anything that isn't a multiple of 8 bytes must be appropriately handled. There probably won't be a problem if the communication will only happen on one system (or both/all systems have similar hardware/software), but there could be problems if the encryption is done on a little-endian machine and the decryption is done on a big-endian machine, or vice versa. Also, AFAIK, the DES algorithm stipulates how the text is supposed to be handled without relying on any endianness. Be cautious; read carefully; test thoroughly. – Jonathan Leffler Oct 12 '18 at 20:20
  • Before learning C, you should come to learn that each system has an internal representation for integers (endianness, padding bytes, sign representation, traps, etc). Heck, it's possible that an `unsigned long long int` might be *four* bytes (where `CHAR_BIT` is 16, a `char` is 16 bits and thus so are bytes, C permits this)... and here you are copying *eight* into it... you're dabbling in undefined behaviour, which probably means you could use a book to guide you away from writing non-portable, unstable and difficult to debug code. – autistic Oct 12 '18 at 20:20

2 Answers

5

Your system is using little endian byte ordering for integers. That means that the least significant byte comes first. For example, a 32 bit integer would store 258 (0x00000102) as 0x02 0x01 0x00 0x00.

Rather than copying your string into an integer, just loop through the characters and print each one in hex:

size_t i;
size_t len = strlen((const char *)string); /* strlen expects a char pointer */
for (i = 0; i < len; i++) {
    printf("%02x ", string[i]);
}
printf("\n");

Since string is an array of unsigned char and you're doing bit manipulation for the purpose of implementing DES, you don't need to change it at all. Just use it as is.

dbush
5

Looks like you've just discovered by accident how CPUs store integer values. There are two competing byte-order conventions, little-endian and big-endian, and both are found in the wild.

If you want them in byte-for-byte order, an integer type will be problematic and should be avoided. Just use a byte array.

There are conversion functions that can go from one endian form to another, though you need to know what sort your architecture uses before converting properly.

So if you're reading in a binary value, you must know its byte order to import it correctly into a native int type. It's generally good practice to pick one consistent byte order when writing binary files so readers don't have to guess; the "network byte order" scheme (big-endian) used in the vast majority of internet protocols is a good default. Then you can use functions like htonl and ntohl to convert back and forth as necessary.

tadman