24

I am trying to use Unix sockets to test sending some UDP packets to localhost.

It is my understanding that when setting the IP address and port in order to send packets, I would fill my sockaddr_in with values converted to network byte order. I am on OSX and I'm astonished that this

printf("ntohl: %d\n", ntohl(4711));
printf("htonl: %d\n", htonl(4711));
printf("plain: %d\n", 4711);

Prints

ntohl: 1729232896
htonl: 1729232896
plain: 4711

So neither function actually returns the plain value. I would have expected to see either the results differ, as x86 is little-endian (afaik), or be identical and the same as the actual number 4711. Clearly I do not understand what htonl and ntohl and their variants do. What am I missing?

The relevant code is this:

int main(int argc, char *argv[])
{
   if (argc != 4)
   {
      fprintf(stderr, "%s\n", HELP);
      exit(-1);
   }

   in_addr_t rec_addr = inet_addr(argv[1]); // first arg is '127.0.0.1'
   in_port_t rec_port = atoi(argv[2]);      // second arg is port number
   printf("Address is %s\nPort is %d\n", argv[1], rec_port);
   char* inpath = argv[3];

   char* file_buf;
   unsigned long file_size = readFile(inpath, &file_buf); // I am trying to send a file
   if (file_size > 0)
   {
      struct sockaddr_in dest;
      dest.sin_family      = AF_INET;
      dest.sin_addr.s_addr = rec_addr; // here I would use htons
      dest.sin_port        = rec_port;
      printf("ntohs: %d\n", ntohl(4711));
      printf("htons: %d\n", htonl(4711));
      printf("plain: %d\n", 4711);
      int socket_fd = socket(AF_INET, SOCK_DGRAM, 0);
      if (socket_fd != -1)
      {
         int error;
         error = sendto(socket_fd, file_buf, file_size + 1, 0, (struct sockaddr*)&dest, sizeof(dest));
         if (error == -1)
            fprintf(stderr, "%s\n", strerror(errno));
         else printf("Sent %d bytes.\n", error);
      }
   }

   free(file_buf);
   return 0;
}
oarfish

4 Answers

32

As others have mentioned, both htons and ntohs reverse the byte order on a little-endian machine, and are no-ops on big-endian machines.

What wasn't mentioned is that these functions take a 16-bit value and return a 16-bit value. If you want to convert 32-bit values, you want to use htonl and ntohl instead.

The names of these functions come from the traditional sizes of certain datatypes. The s stands for short while the l stands for long. A short is typically 16-bit while on older systems long was 32-bit.
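To make the 16-bit versus 32-bit distinction concrete, here is a minimal sketch in C. The helper names `wire_bytes16`, `wire_bytes32` and `check_widths` are made up for illustration; only `htons` and `htonl` are real library calls. The helpers copy the converted value's in-memory bytes so the width and order can be inspected.

```c
#include <arpa/inet.h>  /* htons, htonl */
#include <assert.h>
#include <stdint.h>
#include <string.h>

/* Hypothetical helper: store the in-memory bytes of htons(host) in out. */
static void wire_bytes16(uint16_t host, unsigned char out[2])
{
    uint16_t net = htons(host);      /* 16-bit in, 16-bit out */
    memcpy(out, &net, sizeof net);
}

/* Hypothetical helper: store the in-memory bytes of htonl(host) in out. */
static void wire_bytes32(uint32_t host, unsigned char out[4])
{
    uint32_t net = htonl(host);      /* 32-bit in, 32-bit out */
    memcpy(out, &net, sizeof net);
}

/* Usage sketch: 4711 is 0x1267, so the most significant byte comes first. */
static void check_widths(void)
{
    unsigned char s[2], l[4];
    wire_bytes16(4711, s);
    wire_bytes32(4711, l);
    assert(s[0] == 0x12 && s[1] == 0x67);
    assert(l[0] == 0x00 && l[1] == 0x00 && l[2] == 0x12 && l[3] == 0x67);
}
```

The 16-bit variant produces two bytes and the 32-bit variant four, which is why ports (16-bit) go through htons and IPv4 addresses (32-bit) through htonl.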

In your code, you don't need to call htonl on rec_addr, because that value was returned by inet_addr, and that function returns the address in network byte order.

You do however need to call htons on rec_port.
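As a rough sketch of what that looks like, the function below fills a sockaddr_in the way described above. `make_dest` and `check_make_dest` are hypothetical names, not part of the question's code; `inet_addr`, `htons` and `INADDR_NONE` are the standard ones.

```c
#include <arpa/inet.h>   /* inet_addr, htons, ntohs */
#include <assert.h>
#include <netinet/in.h>  /* struct sockaddr_in, INADDR_NONE */
#include <stdint.h>
#include <string.h>

/* Hypothetical helper: returns 0 on success, -1 if ip cannot be parsed.
 * Note that inet_addr also returns INADDR_NONE for "255.255.255.255". */
static int make_dest(const char *ip, uint16_t port, struct sockaddr_in *dest)
{
    in_addr_t addr = inet_addr(ip);   /* already in network byte order */
    if (addr == (in_addr_t)INADDR_NONE)
        return -1;

    memset(dest, 0, sizeof *dest);
    dest->sin_family      = AF_INET;
    dest->sin_addr.s_addr = addr;         /* no htonl here */
    dest->sin_port        = htons(port);  /* host order -> network order */
    return 0;
}

/* Usage sketch: converting back with ntohs recovers the host-order port. */
static void check_make_dest(void)
{
    struct sockaddr_in dest;
    assert(make_dest("127.0.0.1", 8080, &dest) == 0);
    assert(ntohs(dest.sin_port) == 8080);
}
```

Printing dest.sin_port directly on a little-endian host shows 36895 (0x901F) for port 8080 (0x1F90). That is the expected network-order representation, not a corrupted port; use ntohs when printing it.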

dbush
  • If I try to send packets between two programs running on localhost, and treat `rec_port` with `htons`, 8080 turns into 36895. – oarfish May 09 '16 at 19:08
14

"Network byte order" always means big endian.

"Host byte order" depends on architecture of host. Depending on CPU, host byte order may be little endian, big endian or something else. (g)libc adapts to host architecture.

Because Intel architecture is little endian, both functions do the same thing there: they reverse the byte order.
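A small sketch of that claim, assuming the host is either little endian or big endian (it does not handle more exotic byte orders). `host_is_little_endian` and `check_same_on_this_host` are illustrative names only.

```c
#include <arpa/inet.h>  /* htonl, ntohl */
#include <assert.h>
#include <stdint.h>
#include <string.h>

/* Hypothetical helper: 1 if the lowest-addressed byte of the integer 1
 * holds the value 1, i.e. the host stores the least significant byte first. */
static int host_is_little_endian(void)
{
    uint32_t probe = 1;
    unsigned char first;
    memcpy(&first, &probe, 1);
    return first == 1;
}

/* Usage sketch: both directions agree, and the result matches the question. */
static void check_same_on_this_host(void)
{
    uint32_t v = 4711;                    /* 0x00001267 */
    assert(htonl(v) == ntohl(v));         /* same operation either way */
    if (host_is_little_endian())
        assert(htonl(v) == 1729232896u);  /* 0x67120000, as printed above */
    else
        assert(htonl(v) == v);            /* assumed big endian: no-op */
}
```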

gudok
  • 3
    "Network byte order" evolved to mean _big endian_. It was not always that way. It is extraordinarily common now. https://en.wikipedia.org/wiki/Endianness#Networking – chux - Reinstate Monica Apr 28 '16 at 20:29
12

Both functions reverse the bytes' order (on little-endian machines). Why would that return the argument itself?

Try htons(ntohs(4711)) and ntohs(htons(4711)).
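Spelled out as a minimal sketch (the function names `to_net_and_back` and `check_round_trip` are invented for this example), the round trip looks like this:

```c
#include <arpa/inet.h>  /* htons, ntohs */
#include <assert.h>
#include <stdint.h>

/* Hypothetical helper: convert to network order and straight back. */
static uint16_t to_net_and_back(uint16_t host)
{
    return ntohs(htons(host));
}

/* Usage sketch: applying one conversion after the other restores the value. */
static void check_round_trip(uint16_t v)
{
    assert(to_net_and_back(v) == v);
    assert(htons(ntohs(v)) == v);
}
```

A single call may change the number, but the pair always cancels out, which is why a value is only meaningful to print after it has been converted back to host order.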

Michel de Ruiter
  • 1
    Um…right. I guess the real question is then why I am unable to send anything to localhost unless I use the plain ip address, but that is a separate issue. – oarfish Apr 28 '16 at 20:17
  • 4
    "reverse the bytes" IF they have to be reversed on your system... You can guess that either both of them or neither of them reverses the bytes... it depends on your arch. – Sandburg Jul 10 '19 at 07:30
9

These functions are poorly named. Host-to-network and network-to-host are actually the same operation, and really should be called 'change endianness if this is a little-endian machine'.

So on a little-endian machine you do

net (i.e. BE) number = htonl / ntohl (LE number)

and send the BE number on the wire. And when you get a big-endian number from the wire

LE number = htonl / ntohl (net, i.e. BE, number)

On a big-endian machine

net (i.e. BE) number = htonl / ntohl (BE number)

and

BE number = htonl / ntohl (net, i.e. BE, number)

and in the last two cases you see that these functions do nothing.
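One way to see the "always ends up big endian" behaviour without asking what the host is: a sketch that writes and reads the bytes with shifts. This is not how any particular libc implements htonl; `put_be32`, `get_be32` and `check_be32` are hypothetical names used only here.

```c
#include <arpa/inet.h>  /* htonl, used only for comparison */
#include <assert.h>
#include <stdint.h>
#include <string.h>

/* Hypothetical helper: write v to out with the most significant byte first. */
static void put_be32(uint32_t v, unsigned char out[4])
{
    out[0] = (unsigned char)(v >> 24);
    out[1] = (unsigned char)(v >> 16);
    out[2] = (unsigned char)(v >> 8);
    out[3] = (unsigned char)v;
}

/* Hypothetical helper: read a big-endian 32-bit value back from in. */
static uint32_t get_be32(const unsigned char in[4])
{
    return ((uint32_t)in[0] << 24) | ((uint32_t)in[1] << 16) |
           ((uint32_t)in[2] << 8)  |  (uint32_t)in[3];
}

/* Usage sketch: the shifted bytes match the in-memory bytes of htonl. */
static void check_be32(void)
{
    unsigned char b[4];
    uint32_t net = htonl(4711);
    put_be32(4711, b);
    assert(memcmp(b, &net, sizeof net) == 0);
    assert(get_be32(b) == 4711);
}
```

Because the shifts operate on values rather than on memory layout, the same code produces the same wire bytes on little-endian and big-endian hosts.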

pm100
  • 1
    Well, be careful. Although there are few, if any, machines these days that use a byte order different from both little-endian and big-endian, there have been such machines in the past, with some prominence. It is at least conceivable that there are present or future machines where, say, `htonl()` is not its own inverse. POSIX engineered these functions to be able to handle that if they should ever need to. – John Bollinger Apr 28 '16 at 20:28
  • ok then it should be called 'change endianness to big if this is not a big endian machine' – pm100 Apr 28 '16 at 20:36
  • 11
    Even if the two functions do the same thing, having different names for them is still useful because it clarifies the code's intent -- i.e. if I see x=ntohl(blah) I know that the value assigned to x will be semantically meaningful, whereas if I see x=htonl(blah) I know that the value of x is not one that I can rely on to be native-endian. If the code was just x=maybe_endian_swap(blah), then it would not be obvious to me whether I can e.g. printf("%i\n", x); afterwards and see a semantically meaningful value printed. – Jeremy Friesner Apr 28 '16 at 20:54
  • 4
    Endianness is not a point of view of origin, it's a point of view of destination: https://commandcenter.blogspot.com/2012/04/byte-order-fallacy.html – Sandburg Sep 17 '18 at 13:43