I am reading the dynamic symbol table (.dynsym) and using the DT_HASH and DT_GNU_HASH entries to work out how many symbols it contains.
Oddly, the two entries give me different counts. I know that DT_GNU_HASH can be a bit complicated to implement, so to start I am simply using the musl implementation found here.
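/**
 * This is the code from musl's dynamic linker.
 */
size_t gnu_hash_symtab_len_musl(const ElfW(Word) * base_address) {
  // Header words: [0] nbuckets, [1] symoffset, [2] bloom_size, [3] bloom_shift.
  uint32_t nsym;
  const uint32_t *buckets;
  const uint32_t *hashval;
  // Skip the 4 header words and the Bloom filter (pointer-sized words).
  buckets = reinterpret_cast<const uint32_t *>(
      base_address + 4 + (base_address[2] * sizeof(size_t) / 4));
  // Find the highest symbol index that starts a bucket chain.
  for (size_t i = nsym = 0; i < base_address[0]; i++) {
    if (buckets[i] > nsym) nsym = buckets[i];
  }
  if (nsym) {
    // Make the index relative to symoffset, then walk the last chain until
    // the entry whose low bit marks the end of the chain.
    nsym -= base_address[1];
    hashval = buckets + base_address[0] + nsym;
    do nsym++;
    while (!(*hashval++ & 1));
  }
  return nsym;
}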
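For reference, this is how I understand the layout that the function above indexes into. It is only a sketch: the struct and helper names are my own (hypothetical, not taken from musl or glibc), and the Bloom filter word size is passed in explicitly so the arithmetic can be checked for both ELF classes.

```cpp
#include <cassert>
#include <cstddef>
#include <cstdint>

// Hypothetical helper struct: names for the four 32-bit words at the
// start of a DT_GNU_HASH table.
struct GnuHashHeader {
  uint32_t nbuckets;     // base_address[0]
  uint32_t symoffset;    // base_address[1]: index of the first hashed symbol
  uint32_t bloom_size;   // base_address[2]: number of Bloom filter words
  uint32_t bloom_shift;  // base_address[3]
};
static_assert(sizeof(GnuHashHeader) == 16, "header is four 32-bit words");

// Offset, in 32-bit words from the start of the table, of the bucket array.
// Bloom filter words are 4 bytes wide for ELF32 and 8 bytes for ELF64.
inline std::size_t gnu_hash_buckets_offset(
    const GnuHashHeader &h, std::size_t bloom_word_bytes = sizeof(std::size_t)) {
  assert(bloom_word_bytes == 4 || bloom_word_bytes == 8);
  return 4 + h.bloom_size * (bloom_word_bytes / 4);
}
```

The chain array follows the buckets, and its first entry corresponds to symbol index symoffset, not to index 0.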
Here is my DT_HASH implementation:
case (DT_HASH): {
  // https://flapenguin.me/elf-dt-hash
  struct hash_header {
    uint32_t nbucket;
    uint32_t nchain;
  };
  const hash_header *header =
      reinterpret_cast<const hash_header *>(base_address);
  // nchain equals the number of entries in the symbol table.
  sym_cnt_hash = header->nchain;
  break;
}
For my shared libraries the two counts differ. For now it's not a big deal because I am taking the max, but I can't understand why they differ.
EDIT: I found that adding symoffset to the DT_GNU_HASH result makes the two counts the same. However, the symbols below symoffset are valid symbols in my shared library, not just "STN_UNDEF". Why are they in the offset?
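For completeness, here is a minimal sketch of the variant that agrees with DT_HASH for me. It keeps the bucket value as an absolute .dynsym index and only subtracts symoffset when indexing the chain array. The function name and the synthetic table in the note below are made up for illustration, the Bloom word size is a parameter so both ELF classes can be tried, and returning symoffset when every bucket is empty is an assumption on my part.

```cpp
#include <cassert>
#include <cstddef>
#include <cstdint>

// Sketch only: number of .dynsym entries implied by a DT_GNU_HASH table.
// `bloom_word_bytes` is 4 for ELF32 and 8 for ELF64.
std::size_t gnu_hash_symtab_len(
    const uint32_t *table, std::size_t bloom_word_bytes = sizeof(std::size_t)) {
  assert(bloom_word_bytes == 4 || bloom_word_bytes == 8);
  const uint32_t nbuckets = table[0];
  const uint32_t symoffset = table[1];
  const uint32_t bloom_size = table[2];
  const uint32_t *buckets = table + 4 + bloom_size * (bloom_word_bytes / 4);
  const uint32_t *chains = buckets + nbuckets;

  // Highest symbol index that starts a bucket chain (0 means empty bucket).
  uint32_t last = 0;
  for (uint32_t i = 0; i < nbuckets; i++) {
    if (buckets[i] > last) last = buckets[i];
  }
  // Assumption: with no hashed symbols, only entries [0, symoffset) exist.
  if (last < symoffset) return symoffset;

  // chains[0] belongs to symbol `symoffset`; bit 0 marks the end of a chain.
  while (!(chains[last - symoffset] & 1)) last++;
  return static_cast<std::size_t>(last) + 1;
}
```

On a synthetic ELF64-style table `{2, 3, 1, 6, 0, 0, 3, 5, 0x10, 0x11, 0x20, 0x22, 0x23}` (2 buckets, symoffset 3, one 8-byte Bloom word) this returns 8, whereas the function in my question returns 5 for the same table, which is exactly 8 minus symoffset.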