Lately I've been diving into optimizing my C++ code and as such started to play around with Compiler Explorer. Since I mainly develop on Windows with Visual Studio, I used the MSVC compiler.

At some point MSVC's output got out of hand. After some fiddling around, I could narrow it down to the iostream header, which is supposed to be preferred for I/O (SL.io.3).

#include <iostream>
int main() {
    std::cout << "Hello World!\n";
    return 0;
}

The total output from gcc or clang (main plus a static initializer that calls some ios_base init functions) comes to about 20 lines of assembly (after the Godbolt Compiler Explorer filters out directives and comments), while MSVC explodes it into 4000. Most of those lines are separate functions; MSVC's definition of main itself is 7 instructions vs. 8 for gcc/clang. (gcc/clang using GNU/Linux libstdc++ pass an extra length arg to the cout operator overload function, not just 2 pointers like MSVC does when using its own C++ library.)

If I use something like puts instead, MSVC's total output is reasonably compact and comparable to gcc/clang, like here.
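
For reference, a minimal sketch of what such a puts-based variant might look like. The linked example is not reproduced here, so the helper name `hello_puts` and its exact shape are assumptions made for illustration:

```cpp
#include <cstdio>

// Hypothetical helper (not from the original post): prints the same greeting
// through the C stdio interface instead of iostream.
// std::puts appends the newline itself and returns a non-negative value on success.
bool hello_puts() {
    return std::puts("Hello World!") >= 0;
}
```

Calling this from main gives a program that can be compared against the iostream version, since it pulls in no stream templates.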

Can someone kindly explain to me what is happening here and what I'm doing wrong, or point me in the right direction?

Why are MSVC asm listings so bloated for simple functions using C++ libraries?

Peter Cordes
Flowly
  • You aren't doing anything wrong – M.M May 18 '20 at 21:31
  • You will find that, for every compiler, that code ends up in your executable anyway by the time linking is complete – M.M May 18 '20 at 21:32
  • It seems MSVC just includes all the standard library template instantiations - e.g. you can find the basic_string(const char*) ctor there and so on. Probably the generated assembly listing is intended to be self-contained so that it can be compiled by an external assembler. The actual `main` function is between the `main PROC` and `main ENDP` lines. – dewaffled May 18 '20 at 21:42
  • Do note that Compiler Explorer reduces the assembly output to try to show only what's relevant. It's possible that it's getting rid of GCC/Clang stuff and missing the same for MSVC considering it wasn't originally built for MSVC. Also note that each compiler might include more or less debug stuff by default. For example, my Visual Studio gives me a 49 KB executable for a debug build vs. an 11 KB executable for a release build. – chris May 18 '20 at 21:52
  • @dewaffled: Yes, MSVC emits definitions for library templates *including ones it doesn't call*. But **no, that's not to make it self-contained**. In fact the opposite; g++ literally does work by feeding its asm output to a separate assembler (`as` or `gas`). The functions that are called but not defined are in `libstdc++`. MSVC's asm output mode contains extra stuff that you have to remove if you want to actually assemble it into an object file you can link into a working executable. AFAIK, most of that extra asm wouldn't appear as function definitions in a `.obj` if you actually compiled. – Peter Cordes May 18 '20 at 22:00
  • @chris: good point about Godbolt filtering. I updated the question to point that out, along with other key details, to head off more wild guessing about how stuff works. An answer from an MSVC expert would be good, to maybe shed some light on which template functions MSVC chooses to emit definitions for in an asm listing, and why. – Peter Cordes May 18 '20 at 22:11
  • What are the optimization settings? Are you building in Debug or Release mode? – Thomas Matthews May 18 '20 at 23:30
  • @ThomasMatthews you can see all the compiler flags used in the provided Compiler Explorer links. My question is specific to the generated output there and the compilers used. I'm aware, though, that a debug build would contain a lot of debug assembly if I looked at the disassembly in Visual Studio. – Flowly May 19 '20 at 07:49

1 Answer

This may not be a complete answer, but I think I can explain much of the difference.

Much of the standard library (e.g., iostreams) is template-heavy code. I believe the Microsoft compiler generates more template instantiations and relies on the linker to remove unnecessary ones. I think that's an effect of the different strategies the Windows linker uses versus most POSIX ones, but it might also be a result of simply using a different standard library implementation.

If you specify /MD, which tells the compiler you intend to use the DLL version of the standard library, the generated code drops from 4000+ lines to fewer than 500 lines. I don't know precisely why that's the case. Perhaps MSVC knows the DLL library has all the necessary template instantiations while the static library depends on template instantiation from the compiler.
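
One mechanism that would produce this kind of effect, offered here as an assumption rather than something confirmed for MSVC's library: a library can ship prebuilt instantiations and declare them with `extern template`, which asks the compiler not to instantiate those templates implicitly in each translation unit. The sketch below uses a hypothetical `Writer` template as a stand-in for a stream class. Both the declaration and the definition appear in one file only so that the example is self-contained; in a real library the definition would live in a single source file inside the library.

```cpp
#include <cassert>
#include <cstddef>
#include <string>

// Hypothetical stand-in for a library template such as a stream class.
template <typename CharT>
struct Writer {
    std::basic_string<CharT> buffer;

    void put(const CharT* s) {
        assert(s != nullptr);  // precondition: caller passes a valid C string
        buffer += s;
    }

    std::size_t size() const { return buffer.size(); }
};

// Explicit instantiation declaration: user code sees this in the header, and
// the compiler does not implicitly instantiate Writer<char>'s members here
// (inline members may still be instantiated so that they can be inlined).
extern template struct Writer<char>;

// Explicit instantiation definition: normally placed in exactly one source
// file of the library, so the code is emitted once and found at link time.
template struct Writer<char>;

// Uses the instantiation provided above instead of generating its own copy.
std::size_t written_length() {
    Writer<char> w;
    w.put("Hello World!\n");
    return w.size();
}
```

With the declaration visible, a translation unit that only uses `Writer<char>` references the out-of-line members as external symbols instead of emitting its own definitions, which keeps the per-file listing short.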

You can elicit an incremental improvement by handling only C++ exceptions (with /EHs). By default, the compiler will generate code that handles asynchronous system exceptions as well. And while your hello-world sample doesn't explicitly use exceptions, parts of the standard library probably do. At this point, it looks like a lot of the additional lines are setting up stack unwinding tables and calling destructors.

A lot of the remaining excess in the MSVC version looks like it exists to unwind the stack while calling destructors, so the exception handling model may be different.
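
To illustrate why destructors and unwinding go together, here is a minimal sketch. The names `Tracer`, `may_throw`, and `run` are hypothetical and are not taken from the standard library or the original example. Any function that holds a local object with a non-trivial destructor while calling something that might throw needs cleanup code and unwind information, even if no exception is ever thrown at run time.

```cpp
#include <stdexcept>
#include <string>

// Hypothetical RAII type that records when it is constructed and destroyed.
struct Tracer {
    std::string& log;
    explicit Tracer(std::string& l) : log(l) { log += "ctor;"; }
    ~Tracer() { log += "dtor;"; }
};

// Hypothetical callee that is not noexcept, so the caller must be prepared
// to unwind through it.
void may_throw(bool fail) {
    if (fail) {
        throw std::runtime_error("boom");
    }
}

std::string run(bool fail) {
    std::string log;
    try {
        Tracer t(log);    // the destructor must run on both exit paths
        may_throw(fail);  // the compiler emits unwind data for this call site
        log += "body;";
    } catch (const std::runtime_error&) {
        log += "caught;"; // runs after t has already been destroyed
    }
    return log;
}
```

Here `run(false)` yields `ctor;body;dtor;` and `run(true)` yields `ctor;dtor;caught;`. The second path is the one the extra tables and cleanup code exist to support.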

I thought Compiler Explorer had a "clang-cl" option in the past but I don't see it now. clang-cl, generally speaking, is a command driver that interprets cl.exe options and adjusts default options to make clang produce code that's binary ABI compatible with Microsoft code. It would be interesting to see if it generates code like regular clang or whether it ends up emitting code more like MSVC.

Adrian McCarthy
  • Are you sure all those function definitions that appear in the asm listing output would actually be present in a `.obj` object file? Godbolt doesn't allow the "compile to binary" option for MSVC, and it doesn't stop at `.obj` files anyway. I've read that MSVC's asm listing output is informational and isn't exactly what it puts in object files when compiling. So IDK how much of the explanation is excess verbosity purely in making asm listings, vs. how much is extra functions that the linker will remove. – Peter Cordes May 19 '20 at 00:22
  • @PeterCordes: Yes, all those function definitions and vtables exist in the .obj file. I built it locally and then used dumpbin on the .obj with /DISASM and /SYMBOLS. Each function in its own COMDAT. The symbols that should come from the standard library are generally marked "pick any". Comdats of generated debug info are marked to be picked if their associated comdat is picked. Comdats of generated vtables look like they're all "pick largest". A lot of the generated symbols appear related to exception handling/unwinding/RTTI. – Adrian McCarthy May 19 '20 at 18:15
  • Another incremental improvement can be had by adding /Zc:inline to the compilation arguments, which tells the compiler not to emit unreferenced data or functions that are COMDATs or that have internal linkage only. – Chris Kline Jan 27 '23 at 14:57