
I am building a Pin tool that detects uninitialized reads in a Linux application, based on this blog post.
The author's code is also available from the blog.

Since the original is for Windows, I tried to create a Linux-compatible version. But when I run my Pin tool with an application, a segmentation fault occurs. The odd part is that the fault occurs at a function call (when the Pin tool calls taint_get from inside taint_define), not at an access through an uninitialized heap pointer or any other usual cause of a segmentation fault.

The point of the segmentation fault looks like this:

VOID Instruction(INS ins, VOID *v)
{
   Uninit_Instruction(ins, v);
}

void Uninit_Instruction(INS ins, void* v)
{
   // check if the stack pointer is altered (i.e. memory is allocated on the
   // stack by subtracting an immediate from the stack pointer)
   if(INS_Opcode(ins) == XED_ICLASS_SUB &&
      INS_OperandReg(ins, 0) == REG_STACK_PTR &&
      INS_OperandIsImmediate(ins, 1)) 
   {
      // insert call after, so we can pass the stack pointer directly
      INS_InsertCall(ins, IPOINT_AFTER, (AFUNPTR)taint_undefined,
             IARG_REG_VALUE, 
             REG_STACK_PTR, 
             IARG_ADDRINT, (UINT32) INS_OperandImmediate(ins, 1),
             IARG_END);
   }

   UINT32 memOperands = INS_MemoryOperandCount(ins);

   for (UINT32 memOp = 0; memOp < memOperands; memOp++)
   {
      if (INS_MemoryOperandIsRead(ins, memOp))
      {
         INS_InsertCall(ins, IPOINT_BEFORE, (AFUNPTR)taint_check,
                IARG_INST_PTR,
                IARG_MEMORYOP_EA, memOp,
                IARG_MEMORYREAD_SIZE,
                IARG_END);
      }

      if (INS_MemoryOperandIsWritten(ins, memOp))
      {
         INS_InsertCall(ins, IPOINT_BEFORE, (AFUNPTR)taint_define,
                IARG_MEMORYOP_EA, memOp,
                IARG_MEMORYWRITE_SIZE,
                IARG_END);
      }
   }

}

The callback functions look like this:

// Taint this address as written
void taint_define(ADDRINT addr, UINT32 size)
{
   // Debug purpose
   TraceFile << "taint_define: " << addr << ", " << size << endl;

   // taint the addresses as defined, pretty slow, but easiest to implement
   for (UINT32 i = 0; i < size; i++) 
   {
      //TraceFile << "taint_define_loop size: " << size << endl;
      UINT32 *t = taint_get(addr + i);
      TraceFile << "after taint_get" << endl;
      UINT32 index = (addr + i) % 0x20000;

      // define this bit
      t[index / 32] |= 1 << (index % 32);
   }
}


inline UINT32* taint_get(ADDRINT addr)
{
   // Debug purpose
   TraceFile << "taint_get: " << addr;

   // allocate memory to taint these memory pages
   if(taint[addr / 0x20000] == NULL) {
      // we need a 16 KB page to track 128 KB of memory
      /*
        taint[addr / 0x20000] = (UINT32 *) W::VirtualAlloc(NULL, 0x20000 / 8,
            MEM_COMMIT | MEM_RESERVE, PAGE_READWRITE);
      */
      taint[addr / 0x20000] = (UINT32*)malloc(0x20000/8);
   }

   return taint[addr / 0x20000];
}

The output looks like this:

C:Tool (or Pin) caused signal 11 at PC 0x7fcf475e08a4
segmentation fault (core dumped)

The log is:

Watched Image count: 0x1
WatchedImage: unread_3vars
Uninit_Image
Uninit_Image
Thread start
taint_define: 0x7fff06930d58, 0x8

I'm working on Fedora Core 17 x86-64 with gcc 4.7.2 and Pin 2.12-58423.
My Pin tool code is attached here.

LocustSpectre
  • You should compile with all warnings on (`-Wall -Wextra -pedantic`) and fix the code until the compiler issues no more warnings. Then compile with debugging symbols using `-g` and run the program under a debugger such as gdb. It shows you exactly where the program crashed and lets you inspect the variables in use. – alk Nov 10 '13 at 11:15

2 Answers


I am currently building a Pin tool which detects uninitialized reads from Linux application, based on this blog post.

This doesn't really answer your question, and you may have other reasons to learn Pin, but ...

We've found Pin-based tools inadequate for instrumenting non-toy programs. If your goal is to detect uninitialized memory reads, consider using MemorySanitizer.

Employed Russian

readb4write is 32-bit only. I don't know how you are compiling it, but even if you add -m32 it might still not work. That is what happened in my case, although I am running it on Windows.
You can tell it is 32-bit only from, for example, this comment: "// we use 0x8000 chunks of 128k to taint"
0x8000 x 128 KB = 4294967296 bytes (4 GB), which is the virtual address range limit of a 32-bit process.
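To make that limit concrete, here is a small sketch of the index arithmetic, assuming the flat taint table has 0x8000 entries as the quoted comment suggests. The names `kChunkSize`, `kFlatEntries`, `chunk_index`, and `fits_flat_table` are hypothetical and do not come from readb4write; the 64-bit address is the one printed in the log in the question.

```cpp
#include <cstdint>

// Each table entry covers one 128 KB chunk of address space.
constexpr std::uint64_t kChunkSize = 0x20000;
// Assumed size of the original flat table: 0x8000 entries cover 4 GB.
constexpr std::uint64_t kFlatEntries = 0x8000;

// Index that a flat-table taint_get would use for a given address.
constexpr std::uint64_t chunk_index(std::uint64_t addr) {
    return addr / kChunkSize;
}

// True when the index stays inside the flat table.
constexpr bool fits_flat_table(std::uint64_t addr) {
    return chunk_index(addr) < kFlatEntries;
}

// Any 32-bit address fits; the stack address from the log does not.
static_assert(fits_flat_table(0xFFFFFFFFull), "top of 32-bit space fits");
static_assert(!fits_flat_table(0x7fff06930d58ull), "x86-64 stack address overflows");
```

With a typical x86-64 stack address the index is far beyond the end of the table, so the lookup inside taint_get reads out of bounds. That would be consistent with a crash right after the first "taint_define" log line.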
On x64 you would need to handle 48-bit addresses in taint_get. This is still a naive implementation, but so is everything else:

typedef UINT64 * TTaint[0x80000];
TTaint *taintTable[0x10000] = { 0 };


inline UINT64 *taint_get(ADDRINT addr)
{
   // index of the 128 KB chunk that contains addr
   UINT64 chunkAddress = addr / 0x20000;

   // split the chunk index into a first-level and a second-level index
   UINT64 firstLevAddr = chunkAddress / 0x10000;
   UINT64 secondLevelAddr = chunkAddress % 0x10000;

   // allocate the second-level table on first use
   TTaint *taint = NULL;
   if (taintTable[firstLevAddr] == NULL) {
      taintTable[firstLevAddr] = (TTaint*)W::VirtualAlloc(NULL, sizeof(TTaint),
            MEM_COMMIT | MEM_RESERVE, PAGE_READWRITE);
   }
   taint = taintTable[firstLevAddr];

   // allocate memory to taint these memory pages
   if ((*taint)[secondLevelAddr] == NULL) {
      // we need a 16 KB page to track 128 KB of memory
      (*taint)[secondLevelAddr] = (UINT64 *)W::VirtualAlloc(NULL, 0x20000 / 8,
            MEM_COMMIT | MEM_RESERVE, PAGE_READWRITE);
   }
   return (*taint)[secondLevelAddr];
}
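Since the question targets Linux, where W::VirtualAlloc is not available, the same two-level lookup can be sketched with the standard allocator. This is a minimal sketch under assumptions, not the readb4write code: it uses `std::calloc` so that new bitmaps start zeroed (plain `malloc`, as in the question, leaves them uninitialized), it assumes user addresses fit in 48 bits, and the constant and table names are made up for illustration.

```cpp
#include <cstdint>
#include <cstdlib>

constexpr std::uint64_t kChunkSize   = 0x20000;  // bytes tracked per bitmap
constexpr std::uint64_t kSecondLevel = 0x10000;  // bitmaps per second-level table
constexpr std::uint64_t kFirstLevel  = 0x8000;   // covers 48-bit addresses

// First-level table: each entry points to a lazily allocated second-level table.
static std::uint64_t **taint_table[kFirstLevel] = {nullptr};

// Returns the zero-initialized bitmap for the chunk holding addr, or
// nullptr if addr is outside the covered range or allocation fails.
inline std::uint64_t *taint_get(std::uint64_t addr) {
    std::uint64_t chunk  = addr / kChunkSize;
    std::uint64_t first  = chunk / kSecondLevel;
    std::uint64_t second = chunk % kSecondLevel;
    if (first >= kFirstLevel) return nullptr;

    if (taint_table[first] == nullptr) {
        // Assumes all-zero bytes represent null pointers (true on x86-64).
        taint_table[first] = static_cast<std::uint64_t **>(
            std::calloc(kSecondLevel, sizeof(std::uint64_t *)));
        if (taint_table[first] == nullptr) return nullptr;
    }

    std::uint64_t *&bitmap = taint_table[first][second];
    if (bitmap == nullptr) {
        // One bit per byte: 128 KB of memory needs a 16 KB bitmap.
        bitmap = static_cast<std::uint64_t *>(std::calloc(kChunkSize / 8, 1));
    }
    return bitmap;
}
```

Callers would need to check for a null return before touching the bitmap. Whether calling the C allocator from analysis routines is acceptable for a given Pin version is something to verify against the Pin documentation.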

Also, most (if not all) variables need to be UINT64 instead of UINT32, and the constant 32 used in the bit-index arithmetic needs to be changed to 64.
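As an illustration of that change, the bit-setting step could look like the sketch below. The helper names `mark_defined` and `is_defined` are hypothetical. One detail that is easy to miss when widening from 32 to 64: the shifted constant must itself be 64 bits wide (`1ULL`), because shifting a plain `int` by 32 or more is undefined behavior in C++.

```cpp
#include <cstdint>

constexpr std::uint64_t kChunkSize = 0x20000;  // bytes tracked per bitmap

// Mark the byte at addr as defined in its chunk's bitmap.
inline void mark_defined(std::uint64_t *bitmap, std::uint64_t addr) {
    std::uint64_t index = addr % kChunkSize;
    // 1ULL keeps the shift 64 bits wide.
    bitmap[index / 64] |= 1ULL << (index % 64);
}

// Report whether the byte at addr has been marked as defined.
inline bool is_defined(const std::uint64_t *bitmap, std::uint64_t addr) {
    std::uint64_t index = addr % kChunkSize;
    return ((bitmap[index / 64] >> (index % 64)) & 1ULL) != 0;
}
```

Here `bitmap` is assumed to be the 16 KB block returned by taint_get for the chunk that contains `addr`.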

There is another problem I have not solved yet. One line checks whether the instruction accessing uninitialized memory belongs to the program being checked. It is unlikely to still be valid on x64:
(ip & 0xfff00000) == 0x00400000
I will publish the code on GitHub if I manage to get it working.
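One possible replacement for the hard-coded mask is a plain range check against the bounds of the image of interest. The sketch below is an assumption about how that could be structured, not tested tool code: `ImageRange` and `in_image` are hypothetical names, and the bounds would have to be recorded when the image is loaded (Pin's `IMG_LowAddress` and `IMG_HighAddress` look like the natural source, but check the documentation for your Pin version).

```cpp
#include <cstdint>

// Address range of the image being checked (hypothetical helper type).
struct ImageRange {
    std::uint64_t low;   // lowest address of the image
    std::uint64_t high;  // highest address of the image, inclusive
};

// True when the instruction pointer falls inside the recorded image range.
inline bool in_image(const ImageRange &img, std::uint64_t ip) {
    return ip >= img.low && ip <= img.high;
}
```

Unlike the mask, this does not depend on where the loader places the executable, which matters for position-independent executables on x86-64.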

Jacek Tomaka