I am using a musl-targeting cross compiler for ARM, built with musl-cross-make (gcc 9.2.0, musl 1.2.0). When I compile a simple hello world C program that uses printf, I get undefined references to symbols that are in the standard library:

cross-musl/bin/arm-linux-musleabihf-gcc -c hello.c -o hello.o
cross-musl/bin/arm-linux-musleabihf-ld  hello.o -o hello.elf
cross-musl/bin/arm-linux-musleabihf-ld: warning: cannot find entry symbol _start; defaulting to 0000000000010074
cross-musl/bin/arm-linux-musleabihf-ld: hello.o: in function `main':
hello.c:(.text+0x18): undefined reference to `puts'

When I add libc.a and crt1.o to the linker command, I get no error:

cross-musl/bin/arm-linux-musleabihf-gcc -c hello.c -o hello.o
cross-musl/bin/arm-linux-musleabihf-ld -Lcross-musl/arm-linux-musleabihf/lib -lc cross-musl/arm-linux-musleabihf/lib/crt1.o hello.o -o hello.elf

I thought it was not necessary to specify the standard libraries and startup files unless -nostartfiles, -nostdlib or -nodefaultlibs is used. Am I wrong?

ErwinP
  • Because the bootstrap is in the C library, not part of the compiler (nor your code, apparently). You still have to check the output and confirm that the linker script matches your target and the code is bootable/runnable. – old_timer Jun 02 '20 at 19:00
  • Examine what the linker call for a normal gcc hello.c -o hello looks like. It has the paths/libraries in there. – old_timer Jun 02 '20 at 19:02
  • @old_timer: this toolchain is targeting Linux, not a bare-metal target, so this should work, shouldn't it? @ErwinP: I was able to reproduce your issue using `https://musl.cc/armel-linux-musleabihf-cross.tgz`, so you are not alone. – Frant Jun 02 '20 at 19:14
  • The toolchain doesn't know the difference, does it? It's just a dumb compiler and linker. Examine the linker call when your compiler does the linking. – old_timer Jun 02 '20 at 19:17
  • lol, for x86 Linux the linker call is massive, can't fit it here, but things like x86_64-linux-gnu/crti.o, -L/lib/x86_64-linux-gnu, -lc and -lgcc kind of jump out. – old_timer Jun 02 '20 at 19:21
  • @old_timer: I would say this depends on how the toolchain was built, but I would not bet my life on it. It works perfectly fine using `gcc/glibc`: `/opt/arm/9/gcc-arm-9.2-2019.12-x86_64-arm-none-linux-gnueabihf/bin/arm-none-linux-gnueabihf-gcc -o hello.o hello.c`, then `/opt/arm/9/gcc-arm-9.2-2019.12-x86_64-arm-none-linux-gnueabihf/bin/arm-none-linux-gnueabihf-ld -o hello.elf hello.o`, after which `file hello.elf` reports `hello.elf: ELF 32-bit LSB executable, ARM, EABI5 version 1 (SYSV), dynamically linked, interpreter /lib/ld-linux-armhf.so.3, for GNU/Linux 3.2.0, not stripped` – Frant Jun 02 '20 at 19:23
  • Same for a cross-compiled Linux ARM compiler. ld is expected to be passed all of these things; it is a pretty dumb tool. The compiler driver (the gcc wrapper) has the most smarts. – old_timer Jun 02 '20 at 19:31
  • @Frant Yes, because gcc is passing all those parameters to the linker. With glibc the toolchain is one package (binutils, compiler, C library and other libraries), and gcc knows where everything is relative to itself. ld doesn't know where anything is; it has to be told everything. So either you build gcc to know which library to use and where it is, if possible, or you do the normal thing and pass all the relevant information to ld as gcc would, which is normally 10-20 things, maybe more (not all of which you need). – old_timer Jun 02 '20 at 19:33
  • Just look at the call to the linker in those cases and you will see what I am saying; there won't be any exceptions. – old_timer Jun 02 '20 at 19:34
  • What I meant is that the exact same commands, just replacing `arm-linux-musleabihf-gcc` with `arm-none-linux-gnueabihf-gcc`, executed on an `x86_64` Linux system, work. I don't disagree with what you are saying. – Frant Jun 02 '20 at 19:35

2 Answers

This is how the GNU tools work. gcc has the most smarts: when it is used to call not just the compiler but also the linker, it knows where everything is relative to itself (binutils and the C library). That is why you see what you see when you build a GNU-based toolchain. It is trivial to examine the call from gcc to the linker and see that it specifies everything. ld has no clue where it is, nor any way to figure out where anything else is; it has to be told everything. This is how ld was designed. Here is a simple example with a Linux program and a cross compiler:

#include <stdlib.h>
int main ( void )
{
    exit(1);
}

arm-linux-gnueabi-gcc so.c -o so.o

and this is what is passed to ld (the actual binary is named ld when called from arm-whatever-gcc) to make linking work.

[1][-plugin]
[2][/usr/lib/gcc-cross/arm-linux-gnueabi/5/liblto_plugin.so]
[3][-plugin-opt=/usr/lib/gcc-cross/arm-linux-gnueabi/5/lto-wrapper]
[4][-plugin-opt=-fresolution=/tmp/ccaUZvi4.res]
[5][-plugin-opt=-pass-through=-lgcc]
[6][-plugin-opt=-pass-through=-lgcc_s]
[7][-plugin-opt=-pass-through=-lc]
[8][-plugin-opt=-pass-through=-lgcc]
[9][-plugin-opt=-pass-through=-lgcc_s]
[10][--sysroot=/]
[11][--build-id]
[12][--eh-frame-hdr]
[13][-dynamic-linker]
[14][/lib/ld-linux.so.3]
[15][-X]
[16][--hash-style=gnu]
[17][--as-needed]
[18][-m]
[19][armelf_linux_eabi]
[20][-z]
[21][relro]
[22][-o]
[23][so.o]
[24][crt1.o]
[25][crti.o]
[26][/usr/lib/gcc-cross/arm-linux-gnueabi/5/crtbegin.o]
[27][-L/usr/lib/gcc-cross/arm-linux-gnueabi/5]
[28][-L/usr/lib/gcc-cross/arm-linux-gnueabi/5/../../../../arm-linux-gnueabi/lib/../lib]
[29][-L/lib/../lib]
[30][-L/usr/lib/../lib]
[31][-L/usr/lib/gcc-cross/arm-linux-gnueabi/5/../../../../arm-linux-gnueabi/lib]
[32][/tmp/ccmlFvlr.o]
[33][-lgcc]
[34][--as-needed]
[35][-lgcc_s]
[36][--no-as-needed]
[37][-lc]
[38][-lgcc]
[39][--as-needed]
[40][-lgcc_s]
[41][--no-as-needed]
[42][/usr/lib/gcc-cross/arm-linux-gnueabi/5/crtend.o]
[43][crtn.o]
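
One way to see this call for yourself, as a minimal sketch rather than the method used to produce the listing above: GCC's -### option prints the commands the driver would run, without running them. The default value of CC is an assumption taken from the toolchain path in the question, and show_link_line is a hypothetical helper name. The grep assumes the link step shows up as a collect2 invocation (collect2 then calls ld), which is typical for GCC but may differ on your build.

```shell
#!/bin/sh
# Assumption: CC points at the cross gcc driver from the question.
CC="${CC:-cross-musl/bin/arm-linux-musleabihf-gcc}"

# Print the link command the driver would run for the given arguments.
# -### writes the commands to stderr and does not execute them.
show_link_line() {
    "$CC" -### "$@" 2>&1 | grep 'collect2'
}
```

Calling show_link_line hello.o -o hello.elf should print one long line containing the startup files, the -L paths and -lc that the driver adds on its own.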

If you want to link yourself, you have to point ld at everything; this has always been the case. One exception is the default linker script, but since you want to replace the C library, you absolutely have to replace that too.

Either that, or you build into gcc the knowledge/mechanism to pass the new library information to the linker instead of the GNU C library information/paths.

Another example, x86 native:

gcc -O2 so.c -o so
[1][-plugin]
[2][/usr/lib/gcc/x86_64-linux-gnu/5/liblto_plugin.so]
[3][-plugin-opt=/usr/lib/gcc/x86_64-linux-gnu/5/lto-wrapper]
[4][-plugin-opt=-fresolution=/tmp/ccmhqpU9.res]
[5][-plugin-opt=-pass-through=-lgcc]
[6][-plugin-opt=-pass-through=-lgcc_s]
[7][-plugin-opt=-pass-through=-lc]
[8][-plugin-opt=-pass-through=-lgcc]
[9][-plugin-opt=-pass-through=-lgcc_s]
[10][--sysroot=/]
[11][--build-id]
[12][--eh-frame-hdr]
[13][-m]
[14][elf_x86_64]
[15][--hash-style=gnu]
[16][--as-needed]
[17][-dynamic-linker]
[18][/lib64/ld-linux-x86-64.so.2]
[19][-z]
[20][relro]
[21][-o]
[22][so]
[23][/usr/lib/gcc/x86_64-linux-gnu/5/../../../x86_64-linux-gnu/crt1.o]
[24][/usr/lib/gcc/x86_64-linux-gnu/5/../../../x86_64-linux-gnu/crti.o]
[25][/usr/lib/gcc/x86_64-linux-gnu/5/crtbegin.o]
[26][-L/usr/lib/gcc/x86_64-linux-gnu/5]
[27][-L/usr/lib/gcc/x86_64-linux-gnu/5/../../../x86_64-linux-gnu]
[28][-L/usr/lib/gcc/x86_64-linux-gnu/5/../../../../lib]
[29][-L/lib/x86_64-linux-gnu]
[30][-L/lib/../lib]
[31][-L/usr/lib/x86_64-linux-gnu]
[32][-L/usr/lib/../lib]
[33][-L/usr/lib/gcc/x86_64-linux-gnu/5/../../..]
[34][/tmp/ccg1E03e.o]
[35][-lgcc]
[36][--as-needed]
[37][-lgcc_s]
[38][--no-as-needed]
[39][-lc]
[40][-lgcc]
[41][--as-needed]
[42][-lgcc_s]
[43][--no-as-needed]
[44][/usr/lib/gcc/x86_64-linux-gnu/5/crtend.o]
[45][/usr/lib/gcc/x86_64-linux-gnu/5/../../../x86_64-linux-gnu/crtn.o]

Same deal: ld has no knowledge of where anything is (the C library, its bootstrap, the gcc library, etc.). All of this was baked into gcc, probably by design. gcc is of course a wrapper that calls multiple programs in order to compile: a parser and some other steps, then the actual compiler, then the assembler, then the linker, and perhaps others as well. as just assembles; ld just links.
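
To make the wrapper role concrete, here is a rough sketch that drives the stages one at a time through the gcc driver. The CC default and the build_in_stages name are assumptions for illustration; -E, -S and -c are the standard driver options for stopping after preprocessing, compiling and assembling.

```shell
#!/bin/sh
# Assumption: CC points at the cross gcc driver from the question.
CC="${CC:-cross-musl/bin/arm-linux-musleabihf-gcc}"

# Run each stage separately for one C source file, e.g. hello.c.
build_in_stages() {
    src=$1
    base=${src%.c}
    # Preprocess only: hello.c -> hello.i
    "$CC" -E "$src" -o "$base.i" || return 1
    # Compile to assembly: hello.i -> hello.s
    "$CC" -S "$base.i" -o "$base.s" || return 1
    # Assemble: hello.s -> hello.o
    "$CC" -c "$base.s" -o "$base.o" || return 1
    # Link: the driver adds startup files, -L paths and libc here.
    "$CC" "$base.o" -o "$base.elf"
}
```

Only the last step involves ld, and it is the only one where the driver has to supply the C library and startup files.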

A lot of folks are familiar with bare metal, where you don't use a C library or other stuff, so the linker script is somewhat trivial (though certainly not as trivial as ld file.o -o file.elf, unless you have a hand-made environment for that target). When you want to add libraries, either you teach gcc about them or you pass the linker everything it needs.

-lgcc, for example, simply means: go look for libgcc in the paths given with -L. Without a -L path, no joy, as ld doesn't know about things outside binutils and probably only looks locally.
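
If you do want to call ld by hand, you can ask gcc where the pieces live instead of hard-coding paths. The sketch below only prints a candidate static link command; it does not run it. The CC and LD defaults and both function names are assumptions for illustration, and a real gcc link line contains more (crtbegin.o, crtend.o and so on), so treat the result as a starting point. Libraries are listed after the objects because ld pulls archive members only for symbols that are still undefined when it reaches the archive.

```shell
#!/bin/sh
# Assumptions: CC is the cross gcc driver and LD the matching linker.
CC="${CC:-cross-musl/bin/arm-linux-musleabihf-gcc}"
LD="${LD:-cross-musl/bin/arm-linux-musleabihf-ld}"

# Ask the driver where a startup file or library lives. If gcc cannot
# find the file, it prints the bare name back unchanged.
find_file() {
    "$CC" -print-file-name="$1"
}

# Print (do not run) a static link command: output name, then objects.
manual_link_cmd() {
    output=$1; shift
    libdir=$(dirname "$(find_file libc.a)")
    # --start-group/--end-group let libc and libgcc resolve each other.
    echo "$LD" -static -o "$output" \
        "$(find_file crt1.o)" "$(find_file crti.o)" \
        "$@" \
        -L"$libdir" --start-group -lc "$("$CC" -print-libgcc-file-name)" --end-group \
        "$(find_file crtn.o)"
}
```

For the question's case, manual_link_cmd hello.elf hello.o prints a command you can inspect and compare against what gcc itself passes to the linker.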

binutils is a set of utilities. gcc sits at a higher level and relies on binutils (or a replacement). gcc can't live without binutils, while binutils certainly lives without gcc. Given this and the Unix way, gcc clearly has to pass everything outside binutils to binutils in order for it to serve as a binary utility for gcc.

halfer
old_timer

@old_timer Thanks, you are right. When I invoke gcc instead of ld to generate the executable, I have no unresolved externals:

arm-linux-musleabihf-gcc -c hello.c -o hello.o
arm-linux-musleabihf-gcc -o hello.elf  hello.o
ErwinP