
I'm dealing with an old proprietary shared library -- no source code.

The binary contains thousands of symbols, among them an array of several hundred character strings that I need. I know it is there because strings(1) lists them all, in sequence. Unfortunately, the header files accompanying the library do not declare this array.

How do I find out which symbol in the library refers to the array? I could then add my own declaration and access it from my code.

Mikhail T.

1 Answer


The easy way...

Try the find command in gdb.

For example, I'll guess that the strings are in the .rodata section of the shared library, so we'll use info target to find the address bounds of that section, and do our search in that range:

$ gdb a.out
(gdb) start
Temporary breakpoint 1 at 0x401136
Starting program: /scratch/a.out 
Temporary breakpoint 1, 0x0000000000401136 in main ()
(gdb) info target
    ...
    0x00007f5049f0c000 - 0x00007f5049f0c020 is .rodata in ./lib.so
    ...
(gdb) find 0x00007f5049f0c000,0x00007f5049f0c020,"test string"
0x7f5049f0c000 <foo1>
0x7f5049f0c014
2 patterns found.

That should work if your strings were declared as initialized char arrays, like

static const char foo1[] = "test string";

If they were declared as pointers, like

static const char *foo2 = "another test string";

it's going to be a bit harder. The string itself won't be labeled, and the symbol you're looking for will be a pointer to that string. You'll then need to use the find command again to search for that pointer value, which I'll guess is in the .data section.

(gdb) find 0x00007f5049f0c000,0x00007f5049f0c020,"another test string"
0x7f5049f0c00c
1 pattern found.
(gdb) info target
    ...
    0x00007f5049f0e018 - 0x00007f5049f0e028 is .data in ./lib.so
    ...
(gdb) find 0x00007f5049f0e018,0x00007f5049f0e028,0x7f5049f0c00c
0x7f5049f0e020 <foo2>
1 pattern found.
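
To make the second search less mysterious: gdb is looking for the string's address stored as raw bytes. The following is a minimal Python sketch of that idea, not what gdb does internally. It assumes a 64-bit little-endian target (as the addresses above suggest) and an 8-byte-aligned section start; the function name find_pointer and the demo bytes are made up for illustration.

```python
import struct


def find_pointer(data, base_vaddr, target):
    """Return the virtual addresses in `data` that hold `target` as a pointer.

    Assumes 64-bit little-endian pointers and that base_vaddr is 8-byte
    aligned, so only aligned slots are checked.
    """
    needle = struct.pack("<Q", target)
    hits = []
    for off in range(0, len(data) - 7, 8):
        if data[off:off + 8] == needle:
            hits.append(base_vaddr + off)
    return hits


# Hypothetical contents of the 16-byte .data section from the session above:
# one unrelated word, then the pointer to "another test string".
data_base = 0x7F5049F0E018
demo_data = struct.pack("<QQ", 0, 0x7F5049F0C00C)

print([hex(a) for a in find_pointer(demo_data, data_base, 0x7F5049F0C00C)])
```

For an array of several hundred string pointers, the same search for the address of the first string should land on the start of the array.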

The learning experience...

Use strings --radix=x, and find your string. The offset within the file will be listed (in hex); e.g.:

$ strings --radix=x lib.so | grep "test string" 
5ad740 test string
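
As a cross-check on the strings output, the same file offsets can be found by scanning the raw bytes. This is only a sketch: the helper name find_offsets is hypothetical, and the demo buffer stands in for the real file contents.

```python
def find_offsets(data, needle):
    """Return every offset in `data` at which `needle` occurs."""
    offsets = []
    pos = data.find(needle)
    while pos != -1:
        offsets.append(pos)
        pos = data.find(needle, pos + 1)
    return offsets


# Demo buffer standing in for the library; for a real file, read it with
# open("lib.so", "rb").read() instead.
demo = b"\x7fELF" + b"\0" * 12 + b"test string\0another test string\0"

# The trailing NUL is part of the needle, so a hit must end a C string.
# It can still match the tail of a longer string, as the second hit shows.
for off in find_offsets(demo, b"test string\0"):
    print(f"{off:x}")
```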

Now you need to convert that file offset to a virtual address.

Use readelf -lW to list the segments of the library; you'll see something like this:

$ readelf -lW lib.so | grep -e Type -e LOAD
  Type           Offset   VirtAddr           PhysAddr           FileSiz  MemSiz   Flg Align
  LOAD           0x000000 0x0000000000400000 0x0000000000400000 0x004dc0 0x004dc0 R   0x1000
  LOAD           0x005000 0x0000000000405000 0x0000000000405000 0x53a989 0x53a989 R E 0x1000
  LOAD           0x540000 0x0000000000940000 0x0000000000940000 0x28f7c2 0x28f7c2 R   0x1000
  LOAD           0x7cfc70 0x0000000000bd0c70 0x0000000000bd0c70 0x0014b0 0x00d518 RW  0x1000

You're interested in the Offset, VirtAddr, and FileSiz columns. Find the LOAD segment that contains your file offset X, where Offset <= X < Offset + FileSiz. In this example, the offset 0x5ad740 is in the third segment (starting at offset 0x540000). Convert your offset to a virtual address by subtracting the starting offset of the segment and adding the starting virtual address of the segment:

offset - starting offset + starting virtual address = virtual address
0x5ad740 - 0x540000 + 0x940000 = 0x9ad740
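
The same arithmetic as a small Python sketch, in case you have many offsets to convert. The segment table is copied by hand from the readelf example above; the function name file_offset_to_vaddr is my own.

```python
# (file offset, virtual address, file size) of each LOAD segment,
# copied from the readelf -lW example output.
LOAD_SEGMENTS = [
    (0x000000, 0x400000, 0x004DC0),
    (0x005000, 0x405000, 0x53A989),
    (0x540000, 0x940000, 0x28F7C2),
    (0x7CFC70, 0xBD0C70, 0x0014B0),
]


def file_offset_to_vaddr(offset, segments=LOAD_SEGMENTS):
    """Map a file offset to a virtual address via the containing segment."""
    for seg_offset, seg_vaddr, file_size in segments:
        if seg_offset <= offset < seg_offset + file_size:
            return offset - seg_offset + seg_vaddr
    raise ValueError(f"offset {offset:#x} is not inside any LOAD segment")


print(hex(file_offset_to_vaddr(0x5AD740)))  # prints 0x9ad740
```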

Now use nm -n to scan the symbol table in address order (if the library has been stripped, add -D to read the dynamic symbol table instead). If you're lucky, you'll find an exact match:

$ nm -n lib.so | grep 9ad740
00000000009ad740 R foo1

Otherwise, you'll need to look for the closest symbol with a lower address. That should be it, if the strings were declared as arrays.
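
Finding the closest lower symbol by eye is tedious with thousands of symbols, so here is a rough Python sketch that does it on saved nm -n output. The function names and the sample symbol table (table_a, table_b) are invented for the example; only foo1 comes from the text above.

```python
import bisect


def parse_nm(text):
    """Parse `nm -n` output into a sorted list of (address, type, name)."""
    symbols = []
    for line in text.splitlines():
        parts = line.split()
        # Undefined symbols have no address column; skip them.
        if len(parts) == 3:
            symbols.append((int(parts[0], 16), parts[1], parts[2]))
    symbols.sort()
    return symbols


def nearest_symbol(symbols, vaddr):
    """Return (name, distance) for the closest symbol at or below vaddr."""
    addresses = [addr for addr, _, _ in symbols]
    i = bisect.bisect_right(addresses, vaddr) - 1
    if i < 0:
        return None
    addr, _, name = symbols[i]
    return name, vaddr - addr


# Hypothetical nm -n output for illustration.
sample = """\
                 U strlen
00000000009ad700 R table_a
00000000009ad740 R foo1
00000000009ad800 R table_b
"""

name, delta = nearest_symbol(parse_nm(sample), 0x9AD74C)
print(f"{name}+{delta:#x}")  # prints foo1+0xc
```

The result is only a candidate: a symbol at a lower address does not necessarily extend as far as your string. Comparing the distance against the symbol sizes printed by nm -S is one way to check.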

If the strings were declared as pointers, those pointers will need dynamic relocations (we are looking at a shared library, right?) -- most likely R_xxx_RELATIVE relocations. Look for a RELATIVE relocation whose addend matches your string's virtual address:

$ readelf -rW lib.so
Relocation section '.rela.dyn' at offset 0xc00000 contains 2 entries:
    Offset             Info             Type               Symbol's Value  Symbol's Name + Addend
0000000000c00018  0000000000000008 R_X86_64_RELATIVE                      9ad740

That shows that your pointer is at 0xc00018. Using nm again, you can find the symbol for that virtual address:

$ nm -n lib.so | grep c00018
0000000000c00018 D foo2
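
The relocation lookup can also be scripted. This Python sketch filters saved readelf -rW output for RELATIVE entries whose addend equals the string's virtual address. It assumes the RELA layout shown above (offset, info, type, addend), as used on x86-64; targets that use REL relocations have no addend column and would need a different approach. The function names are hypothetical.

```python
def relative_relocs(readelf_text):
    """Return (offset, addend) for each *_RELATIVE line of `readelf -rW`."""
    found = []
    for line in readelf_text.splitlines():
        parts = line.split()
        # RELATIVE entries have exactly four fields: offset, info, type, addend.
        if len(parts) == 4 and parts[2].endswith("_RELATIVE"):
            found.append((int(parts[0], 16), int(parts[3], 16)))
    return found


def pointers_to(readelf_text, target_vaddr):
    """Return addresses of pointers relocated to point at target_vaddr."""
    return [off for off, addend in relative_relocs(readelf_text)
            if addend == target_vaddr]


# The example output from above, pasted in as text.
sample = """\
Relocation section '.rela.dyn' at offset 0xc00000 contains 2 entries:
    Offset             Info             Type               Symbol's Value  Symbol's Name + Addend
0000000000c00018  0000000000000008 R_X86_64_RELATIVE                      9ad740
"""

print([hex(a) for a in pointers_to(sample, 0x9AD740)])  # prints ['0xc00018']
```

If the strings belong to one array of pointers, the matching relocation offsets for consecutive strings should be consecutive pointer-sized slots, and the lowest one is the start of the array.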
Cary Coutant
A very informative answer for someone with such a low number of points :) You should post more, thank you very much! – Mikhail T. Feb 06 '21 at 17:44