5

I am trying to learn how the whole build chain works so I can better understand what goes on when I do build/link/compile etc.

One point I am having trouble with is this: If the compiler turns the source into native assembly, why can't the same program run on different OSs? Isn't assembly run directly by the CPU? So the same machine code should run on every OS, as long as it is the same architecture, no? Why not?

EDIT: Most of the answers so far are about calling the OS's APIs. That obviously is a problem. My question is about the straight machine code. Does it get passed straight to the CPU or not? If I wrote a program in assembly, would I still need to compile separately for each OS? (Side point: if I used standard C++ cin/cout, is that OS dependent, does it get compiled to direct assembly I/O, or does the answer depend on the compiler?)

Baruch

7 Answers

8

Different operating systems support different binary formats (e.g. ELF vs COFF), different dynamic linkers (with *.so, *.dll, and *.dylib files being linked at runtime, after you've distributed your binary), and provide different sets of functions and libraries for using OS-provided functionality.
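To make the binary-format point concrete, here is a minimal sketch in C that guesses an executable's container format from its first few bytes. The function name `identify_format` is hypothetical, and the check only looks at the well-known leading magic bytes (ELF, the "MZ" stub that PE/COFF executables start with, and little-endian Mach-O). A real loader validates far more than this.

```c
#include <stddef.h>
#include <string.h>

/* Guess the executable container format from the leading bytes of a file.
   Hypothetical helper for illustration; returns a static string. */
const char *identify_format(const unsigned char *buf, size_t len)
{
    /* ELF files start with 0x7f 'E' 'L' 'F'. The literal is split so the
       hex escape does not swallow the 'E'. */
    if (len >= 4 && memcmp(buf, "\x7f" "ELF", 4) == 0)
        return "ELF";

    /* Windows PE/COFF executables start with the DOS stub marker "MZ". */
    if (len >= 2 && buf[0] == 'M' && buf[1] == 'Z')
        return "PE/COFF";

    /* Little-endian Mach-O: 0xfeedfacf (64-bit) or 0xfeedface (32-bit),
       stored least significant byte first. */
    if (len >= 4 && (memcmp(buf, "\xcf\xfa\xed\xfe", 4) == 0 ||
                     memcmp(buf, "\xce\xfa\xed\xfe", 4) == 0))
        return "Mach-O";

    return "unknown";
}
```

An OS loader that expects one of these layouts will typically refuse a file in another, even when the instructions inside are for the same CPU.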

Different sets of functions can be addressed by, for example, the Single UNIX Specification / IEEE Std. 1003.1 (POSIX), which dictates a single set of functions to be provided across all operating systems for various operating system tasks (unfortunately not all OSs comply; Windows, for one, does not). With regard to binary formats (and also CPU instruction-set architecture), one way to deal with this is to distribute some higher-level binary format (bytecode), and then do a just-in-time transformation to the target instruction set and binary format (although this only changes when you do it; it still needs to be done). The low-level virtual machine (LLVM), for example, provides for such a transformation.

Michael Aaron Safyan
8

It comes down to the operating system's API and ABI.

Different operating systems provide different system calls, as well as different mechanisms to invoke those system calls. For example, while POSIX provides fork and execv to create a new process, Windows provides CreateProcess.
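As a rough illustration of that difference, the sketch below wraps both mechanisms behind one hypothetical function, `run_program`, which starts a program and returns its exit code. The wrapper name, the error conventions (-1 on failure, 127 if `execv` fails) and the lack of argument passing are assumptions made for brevity, not anything the two APIs prescribe.

```c
#include <stdio.h>

#ifdef _WIN32
#include <windows.h>

/* Windows: CreateProcess creates the process and loads the program in one call. */
int run_program(const char *path)
{
    STARTUPINFOA si;
    PROCESS_INFORMATION pi;
    DWORD code = 1;
    char cmdline[MAX_PATH];

    ZeroMemory(&si, sizeof si);
    si.cb = sizeof si;
    /* CreateProcess wants a writable command line, so pass a copy. */
    snprintf(cmdline, sizeof cmdline, "%s", path);

    if (!CreateProcessA(NULL, cmdline, NULL, NULL, FALSE, 0, NULL, NULL, &si, &pi))
        return -1;
    WaitForSingleObject(pi.hProcess, INFINITE);
    GetExitCodeProcess(pi.hProcess, &code);
    CloseHandle(pi.hProcess);
    CloseHandle(pi.hThread);
    return (int)code;
}
#else
#include <sys/types.h>
#include <sys/wait.h>
#include <unistd.h>

/* POSIX: fork duplicates the caller, then execv replaces the child's image. */
int run_program(const char *path)
{
    char *const argv[] = { (char *)path, NULL };
    int status;
    pid_t pid = fork();

    if (pid < 0)
        return -1;
    if (pid == 0) {
        execv(path, argv);
        _exit(127);   /* only reached if execv failed */
    }
    if (waitpid(pid, &status, 0) < 0)
        return -1;
    return WIFEXITED(status) ? WEXITSTATUS(status) : -1;
}
#endif
```

The source is portable only because the preprocessor picks one branch per platform. Each compiled binary contains calls to just one OS's functions.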

Furthermore, there are differences at the assembly level. What assembly code do you use to call a function? Different operating systems expect different calling conventions. Operating systems also do not necessarily agree on the formatting of the executable binary, nor do they agree on other mechanisms such as dynamic linking.
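To show where the machine code itself becomes OS-specific, here is a Linux-only sketch that asks the kernel to write bytes by system call number, going through glibc's generic `syscall` function instead of the usual `write` wrapper. The helper name `raw_write` is hypothetical. The value of `SYS_write`, and the instruction and registers used to enter the kernel, are part of the Linux ABI; other kernels use their own numbering and entry mechanisms, so the same compiled bytes would not mean the same thing there.

```c
#define _GNU_SOURCE
#include <sys/syscall.h>
#include <unistd.h>

/* Linux only: invoke the kernel's write system call by number.
   Returns the number of bytes written, or -1 with errno set. */
long raw_write(int fd, const void *buf, unsigned long len)
{
    return syscall(SYS_write, fd, buf, len);
}
```

For example, `raw_write(1, "hi\n", 3)` writes to standard output. The CPU executes the instructions either way; it is the kernel on the other side that decides what the request means.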

Another point to consider is concurrency and how the OS handles that. Some operating systems recognize threads at the kernel level, while others do not. Some might just prefer using multiple processes, and some might use a completely different model. The APIs are different, and the abstractions might be different. For example, one OS might use locks and semaphores, another might use message passing.

Aaron Klotz
6

Because, for one thing, the ability to interface to the operating system is not consistent between platforms. Even between Linux/x86, Windows and Mac/Intel (which may all use the same CPU), the way of doing things may be vastly different.

So while a compiler may produce object files that would work, the minute you link those objects to platform-specific libraries, they become inherently non-portable.

One example is memory allocation. When you want to request more memory from the OS under UNIX, you may use the brk or sbrk library functions. These are not part of the C standard library; they are UNIX-specific.

On the other hand, Windows provides its own functions, such as VirtualAlloc or HeapAlloc, to do the same thing.
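A minimal sketch of that split, assuming a hypothetical wrapper named `os_get_memory`: on UNIX-like systems it moves the program break with `sbrk`, and on Windows it calls `VirtualAlloc`, which is one real Win32 function for obtaining memory from the OS (my choice for the example). In practice you would call `malloc` and let the C library make this choice for you.

```c
#define _DEFAULT_SOURCE   /* asks glibc to declare sbrk */
#include <stddef.h>

#ifdef _WIN32
#include <windows.h>

/* Windows: reserve and commit fresh pages from the OS. */
void *os_get_memory(size_t size)
{
    return VirtualAlloc(NULL, size, MEM_COMMIT | MEM_RESERVE, PAGE_READWRITE);
}
#else
#include <stdint.h>
#include <unistd.h>

/* UNIX: grow the data segment; sbrk returns the previous break,
   which is the start of the newly added region. */
void *os_get_memory(size_t size)
{
    void *old_break = sbrk((intptr_t)size);
    return old_break == (void *)-1 ? NULL : old_break;
}
#endif
```

Both branches return NULL on failure, but the object code produced from each refers to a different library symbol, which is why the linked result is tied to one platform.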

paxdiablo
1

You're right in that a compiler would only be concerned with generating the correct assembler instructions for a specific target CPU. But the compiler doesn't stand alone -- an application normally has to interface with the host operating system in order to run properly.

So the problem is not with the compiler itself, but with the set of standard libraries that each OS provides to do common things like accessing files, allocating memory, or interacting with the graphical window system. While code for, say, Windows XP and Solaris x86 would be compiled to the same set of machine code instructions, the code would have to make different calls to interface with the OS.

Proprietary compilers come bundled only with headers and libraries for the OS they are made for. Other, more agnostic compilers like GCC do share a lot of code for compiling to the same CPU type across different operating systems.

Kim Burgaard
0

The assumption in your question, "So the same machine code should run on every OS, as long as it is the same architecture, no?", is incorrect. Machine code runs on the hardware, not on the operating system. An OS provides services to user and system programs, and these services are implemented differently in each OS. Say, for argument's sake, you took the machine code of your program from OS "X" on arch "A" and fed it directly to a system running OS "Y" on the same arch "A": the CPU would be able to execute the instructions, but this may (and almost always will) crash your program, because of the implementation differences others have already mentioned.

blue_whale
0

Yes, machine code is run directly by the CPU. But even the 32-bit x86 instruction set, which is the machine code, has gone through revisions over the years. In general, though, code compiled for an architecture should run on other systems with that architecture. A bigger issue is which compiler and OS it was compiled for.

Matt Phillips
0

Yes, it does. That's how you can run Windows applications on Linux under Wine, for example. You can't run the program directly, for more than one reason. For example, the format of the executable differs between systems, so systems normally have no idea how to load and execute each other's executables. Besides, most programs will want to call system routines and library functions, and here we are talking about completely different sets of conventions for how that is done on each system.

cababunga