0

I analyze binary files using Python. So far I have used debuggers for dynamic analysis (i.e., running the application and using breakpoints to inspect its runtime behavior). However, the results could be improved if I could use a binary instrumentation framework such as PIN. PIN is developed in C++ and distributed as closed source (only DLLs). We write so-called Pintools to describe where and what we want to intercept. I want to port PIN functionality to Python so that I can continue using Python. I am aware of "ctypes" and boost-python.

My problem is this: to use PIN, we write a pintool and run our binary executable with Pin and the pintool (it is like running the application under a JIT). I have no idea whether I can use ctypes or something similar to import PIN functions and then use that Python code to dynamically analyze the binary. Can you please provide some suggestions or guidelines on how to proceed with this task?

So, in a nutshell, I want to create a Python interface (wrapper) to the PIN framework.

Sanjay
  • Does PIN produce output files? Are you trying to read those output files? Is that what you mean by "binary"? Binary-format output files from PIN? – S.Lott Dec 03 '10 at 12:05
  • @S.Lott: I think 'binary' here means compiled executables. – Thomas K Dec 03 '10 at 12:15
  • Yes, binary means compiled executables (e.g., PE or ELF files). For example, let us say I have an executable abc.exe. If I want to analyze it dynamically, I will run it like this: C:\>pin --my_pintool abc.exe. Here the pintool is a file written in C++ that calls functions defined in the PIN DLL and instruments abc.exe on the fly at runtime. – Sanjay Dec 03 '10 at 13:14
  • In principle, ctypes lets you call functions from a dll. But I imagine it would need a fairly detailed knowledge of Pin, and their mailing list is probably the best place to ask about that. – Thomas K Dec 03 '10 at 13:37
  • @Thomas: Thank you. Yes, I am aware of ctypes. I also posted to the PIN mailing list, but did not get anything very concrete. However, PIN has good documentation describing its functions. – Sanjay Dec 05 '10 at 09:12

2 Answers

3

Check out the ProcessTap project. It appears to implement exactly what you are looking for: http://code.google.com/p/processtap/

0

I was thinking about this recently. I haven't looked into it in detail, but I would approach the problem like this: write a pintool that, upon initialization, starts an embedded Python interpreter and imports a Python module. I'd look at using SWIG to generate bindings for all the PIN API calls you want to use. The pintool would then call a hard-coded function in the imported Python module, and that function would call the API to register more functions and do whatever else you want.
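A minimal sketch of what the Python side of that design might look like. Everything named here is hypothetical: `StubPinApi` stands in for the SWIG-generated bindings (which would have to be written first), `add_instruction_callback` is an invented method name rather than a real Pin call, and `pintool_main` is the hard-coded entry point the C++ pintool would invoke after starting the interpreter.

```python
"""Sketch of the module an embedded interpreter inside a pintool might import.

All names are hypothetical; none of them come from Pin or SWIG.
"""


class StubPinApi:
    """Stand-in for generated bindings; it only records and replays callbacks."""

    def __init__(self):
        self._instruction_callbacks = []

    def add_instruction_callback(self, func):
        # A real binding would forward this to the instrumentation framework.
        self._instruction_callbacks.append(func)

    def simulate_instruction(self, address, mnemonic):
        # Helper for trying the sketch out: pretend one instruction was seen.
        for func in self._instruction_callbacks:
            func(address, mnemonic)


class CallCounter:
    """Example analysis: count how many 'call' instructions were observed."""

    def __init__(self):
        self.calls = 0

    def on_instruction(self, address, mnemonic):
        if mnemonic == "call":
            self.calls += 1


def pintool_main(api):
    """Hard-coded entry point the C++ side would invoke once at start-up."""
    counter = CallCounter()
    api.add_instruction_callback(counter.on_instruction)
    return counter


if __name__ == "__main__":
    api = StubPinApi()
    counter = pintool_main(api)
    for addr, op in [(0x401000, "push"), (0x401001, "call"), (0x401006, "ret")]:
        api.simulate_instruction(addr, op)
    print(counter.calls)
```

Keeping the analysis logic behind a single entry point like this means the module can be exercised with a stub before any real bindings exist.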

I'm not sure how the callbacks would work; I don't know enough about SWIG. Also, this may fail if the program you're trying to instrument itself uses Python. But that's how I'd start trying to solve this problem.
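For a rough idea of how native code can call back into Python, here is a small sketch using ctypes rather than SWIG. It does not involve Pin at all: the C library's `qsort` is used as a stand-in for any native function that accepts a callback pointer, and `py_compare` and `sort_with_c` are names made up for this illustration. Whether the same mechanism is usable from inside a running pintool is an open question.

```python
import ctypes
import ctypes.util
import sys

# Load the C runtime library (assumption: a standard libc or msvcrt is present).
if sys.platform == "win32":
    libc = ctypes.cdll.msvcrt
else:
    libc = ctypes.CDLL(ctypes.util.find_library("c"))

# C signature of the comparator: int (*)(const int *, const int *)
COMPARE = ctypes.CFUNCTYPE(
    ctypes.c_int, ctypes.POINTER(ctypes.c_int), ctypes.POINTER(ctypes.c_int)
)


def py_compare(a, b):
    # Called from C for each comparison; a and b point at array elements.
    return (a[0] > b[0]) - (a[0] < b[0])


# Keep a reference to the wrapped callback so it is not garbage collected
# while native code may still call it.
compare_cb = COMPARE(py_compare)

libc.qsort.argtypes = [ctypes.c_void_p, ctypes.c_size_t, ctypes.c_size_t, COMPARE]
libc.qsort.restype = None


def sort_with_c(values):
    """Sort a list of ints by letting C's qsort call back into Python."""
    arr = (ctypes.c_int * len(values))(*values)
    libc.qsort(arr, len(arr), ctypes.sizeof(ctypes.c_int), compare_cb)
    return list(arr)


if __name__ == "__main__":
    print(sort_with_c([5, 1, 4, 2]))
```

The important detail is keeping `compare_cb` alive for as long as native code may call it; the same concern would apply to any instrumentation callback registered from Python.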

brownan
  • Thank you, Brownan. This is what I had in mind when I posted the question, and I wanted some opinions to start with. I shall definitely look into the approach you have suggested. Once we start working on this, we'll find out how to proceed further. There are other options, like Cython and boost-python, that could be used for C-to-Python and Python-to-C calls. Thank you once again. – Sanjay Dec 05 '10 at 09:03