While instrumenting some code with Intel Pin, I ran into strange behavior that I have isolated in the code below.

Parts of my Pin tool's trace were missing. I found that calling `PIN_InitSymbolsAlt(SYMBOL_INFO_MODE(UINT32(IFUNC_SYMBOLS) | UINT32(DEBUG_OR_EXPORT_SYMBOLS)))` instead of my previous `PIN_InitSymbols()` changed much more than which symbols are accessible: some routines that previously reported zero instructions now report a positive (and realistic) number of instructions.

In the code below (a stripped-down version of my program and Pin tool), I use memcpy as the example.

This Pin tool simply counts the number of instructions executed per routine:

#include "pin.H"

#include <cstdint>
#include <iostream>
#include <map>
#include <string>

std::map<std::string, uint64_t> routines;

VOID CountPtr(UINT64 *counter)
{
    ++(*counter);
}

VOID Routine(RTN rtn, VOID*)
{
    uint64_t& insn = routines[RTN_Name(rtn)];

    RTN_Open(rtn);
    for (INS ins = RTN_InsHead(rtn); INS_Valid(ins); ins = INS_Next(ins))
    {
        INS_InsertCall(ins, IPOINT_BEFORE, (AFUNPTR)CountPtr, IARG_PTR, &insn, IARG_END);
    }
    RTN_Close(rtn);
}

void Fini(INT32, void*)
{
    for (const auto& p : routines)
    {
        std::cout << p.first << " " << p.second << std::endl;
    }
}

int main(int argc, char *argv[])
{
    if (PIN_Init(argc, argv)) return 1; // PIN_Init returns true on a bad command line

    // NOTE: with plain PIN_InitSymbols() here instead, memcpy reports zero instructions.
    PIN_InitSymbolsAlt(SYMBOL_INFO_MODE(UINT32(IFUNC_SYMBOLS) | UINT32(DEBUG_OR_EXPORT_SYMBOLS)));

    RTN_AddInstrumentFunction(Routine, 0);
    PIN_AddFiniFunction(Fini, 0);

    PIN_StartProgram();
    return 0;
}
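
A side note on the counting pattern: the tool hands `&insn` to Pin via `IARG_PTR`, which relies on `std::map` never relocating an element after insertion. A minimal Pin-free sketch of that pattern follows; `counter_for` and `count_ptr` are hypothetical stand-ins for the instrumentation and analysis callbacks, not Pin API. It also makes visible that keying on the name alone means two routines with the same name share one counter.

```cpp
#include <cstdint>
#include <map>
#include <string>

using Counters = std::map<std::string, std::uint64_t>;

// Stand-in for instrumentation time: look up (or create) the counter for a
// routine name and return its address. std::map keeps element addresses
// stable across later insertions, so the pointer can be stored safely.
std::uint64_t* counter_for(Counters& counters, const std::string& routine_name)
{
    return &counters[routine_name];
}

// Stand-in for the analysis callback (CountPtr in the tool above).
void count_ptr(std::uint64_t* counter)
{
    ++(*counter);
}
```

Asking for the same name twice returns the same address, so same-named routines from different images would be summed together.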

My test binary just calls memcpy, again to keep things simple:

// copy.cc
#include <cstring>
#include <cstdint>

void copy(uint8_t* __restrict dst, uint8_t* __restrict src, std::size_t sz)
{
    memcpy(dst, src, sz);
}

// test.cc
#include <cstddef>
#include <cstdint>

unsigned char a[128];
unsigned char b[128];

void copy(uint8_t* __restrict, uint8_t* __restrict, std::size_t);

int main()
{
    copy(b, a, 128);
    return 0;
}

Running it gives:

$ pin -t ./obj-intel64/mytool.so -- /tmp/test | grep memcpy
__memcpy_chk 0
__wmemcpy_chk 0
memcpy 29
memcpy@plt 1
wmemcpy 0

If I now replace the call to PIN_InitSymbolsAlt with PIN_InitSymbols, I get:

$ pin -t ./obj-intel64/mytool.so -- /tmp/test | grep memcpy
__wmemcpy_chk 0
memcpy 0
memcpy@plt 1
wmemcpy 0

I find this a bit dodgy, to be honest. This feature is poorly documented, and either I am overlooking something or this has nothing to do with symbols. For some reason, the analysis of the memcpy routine simply did not return any instructions.

Any ideas on what happened here, and why? Thanks!

  • What version of Pin did you use? What compiler and compiler options did you use to compile `memcpy`? – Hadi Brais Jan 23 '19 at 23:55
  • @HadiBrais g++ 7.3.0 and Pin 3.7 (latest release). The Pin tool has been compiled with the `makefile.rules` from Pin, the test binary has been compiled with `-O3 -g`. – AntiClimacus Jan 24 '19 at 17:25
  • Can you check with `PIN_InitSymbolsAlt` but without passing `IFUNC_SYMBOLS`? I think this will behave like `PIN_InitSymbols`. That flag makes Pin read the ifunc symbols, which may be the reason for the difference. – Hadi Brais Jan 24 '19 at 17:43
  • @HadiBrais Indeed, without `IFUNC_SYMBOLS` I don't get the symbol, so this is the flag that changes things. Still, to me, symbols were one thing and routines another. I guess it is related to the fact that in Pin, RTN (like IMG and unlike INS) is populated at load time rather than on the fly, but I must admit I am slightly scared now that I could be missing part of my analysis! – AntiClimacus Jan 24 '19 at 21:48
  • I don't understand why you're saying that `PIN_InitSymbolsAlt` with `IFUNC_SYMBOLS` is more accurate/reliable. Why would `memcpy` be called 29 times? It makes sense that `memcpy@plt` is called once because all calls to `memcpy` have to go indirectly through the PLT entry of `memcpy`. So I'm not sure what the `memcpy` symbol actually represents. `memcpy@plt` is the one that represents the function and correctly counts the number of invocations, not `memcpy`. – Hadi Brais Jan 24 '19 at 21:57
  • Try to dump all the symbols defined in the memcpy binary using `readelf -Ws` and see what each symbol means. – Hadi Brais Jan 24 '19 at 22:01
  • @HadiBrais I spent substantially more time looking at it and you're right, `IFUNC_SYMBOLS` does not make it more reliable - as you said, the numbers don't make sense. I still don't really understand what the concept of IFUNC means in the Pin world - to be continued. Thanks for having looked at it - much appreciated. – AntiClimacus Jan 26 '19 at 19:43
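
As background for the open IFUNC question: on GNU/Linux, an IFUNC (indirect function) symbol does not point at a function body but at a resolver that the dynamic loader runs to pick the actual implementation, and glibc exports `memcpy` this way on x86-64. The sketch below shows the mechanism outside of Pin, assuming GCC or Clang on an ELF/glibc target; `my_copy`, `resolve_my_copy`, and `copy_forward` are hypothetical names, not glibc's.

```cpp
#include <cstddef>

// One candidate implementation. A real library would provide several,
// tuned for different CPU features.
static void* copy_forward(void* dst, const void* src, std::size_t n)
{
    auto* d = static_cast<unsigned char*>(dst);
    const auto* s = static_cast<const unsigned char*>(src);
    for (std::size_t i = 0; i < n; ++i) {
        d[i] = s[i];
    }
    return dst;
}

extern "C" {

using copy_fn = void* (*)(void*, const void*, std::size_t);

// The resolver: run by the loader while it processes relocations, so keep it
// trivial. It has C linkage because the ifunc attribute names it by symbol.
copy_fn resolve_my_copy()
{
    return copy_forward; // a real resolver would inspect CPU features here
}

// The IFUNC symbol itself: no body, only a pointer to the resolver.
void* my_copy(void* dst, const void* src, std::size_t n)
    __attribute__((ifunc("resolve_my_copy")));

} // extern "C"
```

Callers use `my_copy` like any other function, yet the code that runs lives under a different symbol. Whether Pin's `IFUNC_SYMBOLS` mode attaches the name `memcpy` to the resolver or to the selected implementation is not something this thread establishes; the `IFUNC` entries in the type column of `readelf -Ws` on libc are a starting point for checking.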
