6

So I was thinking about languages the other day, and it struck me that any program written in a compiled language that interacts with the Internet is then translated into assembly that has to interact with the Internet. I've just begun learning a bit of x86 assembly to help me understand C++ a bit better, and I'm baffled by how something so low-level could do something like access the Internet.

I'm sure the full answer to this question is much more than would fit in a SO answer, but could somebody give me maybe a basic summary?

Maulrus
  • 4
    Here's a similar question. http://stackoverflow.com/questions/1837582/how-to-write-to-read-from-network-card-in-x86-assembly Beyond that, you have to remember that "accessing the internet" is just sending data formatted into TCP/IP messages over a wire to another computer. It was sort of an epiphany for me too when I realized there was nothing that magical about it. – Bill Prin Apr 12 '10 at 02:04
  • 2
    The whole internet runs on 1's and 0's, or rather, the whole internet is the product of an incredible number of machine code snippets interacting. I find that arguably more astonishing. :) – deceze Apr 12 '10 at 02:05

4 Answers

14

User-space programs that "interact with the internet", in all modern systems, do so by issuing system calls to the underlying operating system, which supplies the API for a TCP/IP stack.

The system calls in question (such as socket, listen, accept, and so forth) are typically documented at the C level, but in each OS implementation they ultimately translate to machine code, of course. Whether the arguments go in particular registers, or in memory locations pointed to by particular registers, etc., is a minor and entirely system-specific detail.
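As a rough illustration of those calls in use, here is a minimal C sketch that assumes a POSIX system with a working loopback interface. The helper name `loopback_roundtrip` is made up for this example. It plays both roles in one process: it listens on 127.0.0.1, connects to itself, and passes a short message through the kernel's TCP/IP stack.

```c
#include <arpa/inet.h>
#include <netinet/in.h>
#include <string.h>
#include <sys/socket.h>
#include <unistd.h>

/* Hypothetical helper: sends `msg` over a TCP connection on the loopback
 * interface and copies what the accepting side received into `out`.
 * Returns the number of bytes received, or -1 on any failure. */
long loopback_roundtrip(const char *msg, char *out, size_t out_len)
{
    long received = -1;
    int listener = -1, client = -1, server = -1;
    struct sockaddr_in addr;
    socklen_t addr_len = sizeof addr;

    if (out_len == 0)
        return -1;

    memset(&addr, 0, sizeof addr);
    addr.sin_family = AF_INET;
    addr.sin_addr.s_addr = htonl(INADDR_LOOPBACK);
    addr.sin_port = 0;                      /* let the kernel pick a free port */

    listener = socket(AF_INET, SOCK_STREAM, 0);
    if (listener < 0) goto done;
    if (bind(listener, (struct sockaddr *)&addr, sizeof addr) < 0) goto done;
    if (listen(listener, 1) < 0) goto done;
    /* Ask the kernel which port it actually assigned. */
    if (getsockname(listener, (struct sockaddr *)&addr, &addr_len) < 0) goto done;

    client = socket(AF_INET, SOCK_STREAM, 0);
    if (client < 0) goto done;
    /* The kernel performs the TCP handshake and queues the connection. */
    if (connect(client, (struct sockaddr *)&addr, sizeof addr) < 0) goto done;

    server = accept(listener, NULL, NULL);
    if (server < 0) goto done;

    if (send(client, msg, strlen(msg), 0) < 0) goto done;
    /* A single recv is enough for a tiny loopback message; in general
     * a receiver has to loop until it has everything it expects. */
    received = recv(server, out, out_len - 1, 0);
    if (received >= 0)
        out[received] = '\0';

done:
    if (server >= 0) close(server);
    if (client >= 0) close(client);
    if (listener >= 0) close(listener);
    return received;
}
```

Calling `loopback_roundtrip("hello", buf, sizeof buf)` from a `main` should leave the message in `buf`. Every line that touches the network is an ordinary function call here; the compiler turns each one into a call instruction, and the C library then traps into the kernel.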

If you're wondering how the machine code (probably also compiled from C) in the kernel and device drivers "interacts with the internet" (in response to system calls), it does so both by building and maintaining in-memory data structures to track the state of various things, and by interacting with the underlying hardware (e.g. via interrupts, I/O ports, memory mapped device areas, or whatever that particular architecture uses) -- just like it interacts with (say) a video display, or a disk device.
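To make the "memory mapped device areas" part a little more concrete, here is a sketch in C of what driver code that pokes a network card can look like. The register layout, the names `fake_nic_regs` and `fake_nic_start_tx`, and the control bit are all invented for illustration; a real card's datasheet defines its own registers, and a real driver would hand the device a DMA-capable physical address rather than an ordinary pointer.

```c
#include <stdint.h>

/* Entirely hypothetical register layout, not taken from any real device. */
struct fake_nic_regs {
    volatile uint64_t tx_addr;   /* where the frame to transmit lives */
    volatile uint32_t tx_len;    /* frame length in bytes */
    volatile uint32_t control;   /* bit 0: start transmitting */
};

#define FAKE_NIC_CTRL_START 0x1u

/* Tell the (imaginary) card about a frame and ask it to send it.
 * In a real kernel, `regs` would point at the device's memory-mapped
 * region; `volatile` keeps the compiler from dropping or reordering
 * the register writes. */
void fake_nic_start_tx(struct fake_nic_regs *regs,
                       const void *frame, uint32_t len)
{
    regs->tx_addr = (uint64_t)(uintptr_t)frame;
    regs->tx_len  = len;
    regs->control = regs->control | FAKE_NIC_CTRL_START;
}
```

In this sketch the "registers" can be a plain struct in ordinary memory, which is enough to see that, to the CPU, talking to the hardware is just a few stores to particular addresses.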

Alex Martelli
  • 1
    yeah... it's hard to imagine such a small user-mode program doing complex stuff like that, but really it's just making a call to a library that was written in something higher-level (C/C++) which is then translated by the compiler into a BUNCH of assembly code, which, at some really low level, sends a bunch of 5-volt HIGHs and 0-volt LOWs through an ethernet cable to the other side of the world... (if I understand correctly) – Adam Apr 12 '10 at 02:03
  • 1
    Well technically the physical Ethernet pulses only go as far as your router. The router is then responsible for retransmitting them on your behalf to the next router, and the next router is responsible for retransmitting them, etc. etc. – Tyler McHenry Apr 12 '10 at 02:48
2

It depends. When you read about a web script written in C, it's actually a CGI program. CGI is a protocol, not a language. CGI specifies that the web server puts "GET", "POST", etc. into REQUEST_METHOD, "foo=bar&baz=42" into QUERY_STRING, the POST data onto stdin, and so on. To access these, the CGI program reads its environment variables and standard input, both of which the operating system provides. The web server uses CGI to communicate with a web script. A program that communicates across the Internet by itself can use the system's sockets API.
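A minimal sketch of the CGI side in C might look like the following. The helper name `build_cgi_response` and the output format are assumptions made for this example; only the REQUEST_METHOD and QUERY_STRING variable names come from CGI itself.

```c
#include <stdio.h>
#include <stdlib.h>

/* Hypothetical helper: builds a plain-text CGI response that echoes the
 * request method and query string. Returns the number of characters
 * written, as snprintf does. */
int build_cgi_response(char *buf, size_t len)
{
    /* The web server sets these before it starts the program. */
    const char *method = getenv("REQUEST_METHOD");
    const char *query  = getenv("QUERY_STRING");

    if (method == NULL) method = "(unset)";
    if (query == NULL)  query = "";

    /* A CGI response is a header block, a blank line, then the body. */
    return snprintf(buf, len,
                    "Content-Type: text/plain\r\n\r\n"
                    "method=%s\nquery=%s\n",
                    method, query);
}
```

A real CGI program would write the resulting buffer to standard output (for example with `fputs(buf, stdout)`), and the web server would relay it to the client. Note that the program itself never touches a socket.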

In summary, the operating system does all the communicating. The program just makes the right system calls.
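To see how thin the layer between a program and "the right system calls" is, here is a small sketch that assumes Linux on an architecture whose headers define `SYS_socket` (x86-64 does; some older 32-bit setups multiplex socket calls differently). The function name `open_tcp_socket_raw` is made up for this example. It skips the usual `socket()` wrapper and asks the kernel directly through the generic `syscall` entry point.

```c
#ifndef _GNU_SOURCE
#define _GNU_SOURCE              /* for the syscall() declaration on glibc */
#endif
#include <sys/socket.h>
#include <sys/syscall.h>
#include <unistd.h>

/* Hypothetical helper: creates a TCP socket by number rather than by name.
 * Returns a file descriptor, or -1 on failure. */
int open_tcp_socket_raw(void)
{
    /* Same arguments socket() takes; the C library places them in the
     * registers the kernel's calling convention expects and traps in. */
    return (int)syscall(SYS_socket, AF_INET, SOCK_STREAM, 0);
}
```

The descriptor it returns behaves like one obtained from `socket()` and can be closed with `close()`. At the assembly level this comes down to loading a call number and a few arguments into registers and executing one trap instruction.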

If you wonder how the operating system communicates over the Internet, the answer is that the OS kernel uses a driver to interface with the network card over I/O ports, memory-mapped I/O, etc. Between them, the OS and the network card implement the Internet protocol standards that make everything work together.

Joey Adams
0

What you need to do is to look up some of those PIC web-server projects. Some of them are web-servers written in assembly and running on 8-bit hardware. It will give you a clear idea of how something as low-level as assembly can be used to interact with the rest of the world through the Internet.

It basically involves

  1. Writing some low-level drivers (Layer 2) to interface with the networking hardware - this may be Ethernet or even a modem (with SLIP).
  2. Writing the next layers - IP and TCP - to process the TCP/IP packets. This will need some assembly magic, as these protocols are quite involved.
  3. Writing the application layer - whether it be a web server, a client, or whatever - that uses the underlying layers.
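As a taste of what the IP layer in step 2 has to do, here is a sketch in C of the Internet checksum (the ones' complement sum described in RFC 1071) that every IPv4 header carries. The function name `internet_checksum` is chosen for this example. It is shown in C rather than assembly for readability, but it is only adds, shifts, and a complement, which is why it is practical on 8-bit hardware.

```c
#include <stddef.h>
#include <stdint.h>

/* Ones' complement checksum over `len` bytes, treating the data as
 * big-endian 16-bit words. Intended for packet-sized inputs. */
uint16_t internet_checksum(const uint8_t *data, size_t len)
{
    uint32_t sum = 0;
    size_t i;

    /* Add up the data two bytes at a time. */
    for (i = 0; i + 1 < len; i += 2)
        sum += ((uint32_t)data[i] << 8) | data[i + 1];

    /* An odd trailing byte is padded with a zero byte. */
    if (i < len)
        sum += (uint32_t)data[i] << 8;

    /* Fold any carries back into the low 16 bits. */
    while (sum >> 16)
        sum = (sum & 0xFFFF) + (sum >> 16);

    return (uint16_t)~sum;
}
```

To fill in a header, the sender computes this with the checksum field set to zero and stores the result there. A receiver that runs the same function over the whole header, checksum included, gets 0 if nothing was corrupted.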

Hope this clears up some doubt.

sybreon
0

Is it reasonable to say that, regardless of the program, code at some point gets transformed (for lack of the proper term) into some form of "assembly" language (I think there is more than one), which then has a "one-to-one" relationship to machine code? I'm not sure how .NET with ILASM, or Java with its bytecode, fits into this, but I thought all of it at some point turned into assembly and then machine code.

CDUB