I'm having trouble using r2pipe, Radare2's API, with the multiprocessing Pool.map function in Python. The application hangs on pool.join().

My hope was to use multithreading via the multiprocessing.dummy module in order to evaluate functions quickly through r2pipe. I have tried passing my r2pipe object through a namespace using the Manager class. I have also tried using events, but neither approach works.

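# Imports were omitted from the original post; the aliases below are a guess.
import r2pipe
from itertools import product
from multiprocessing.dummy import Pool as ThreadPool, Manager as ThreadMgr

class Test: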
    def __init__(self, filename=None):
        if filename:
            self.r2 = r2pipe.open(filename)
        else:
            self.r2 = r2pipe.open()
        self.r2.cmd('aaa')

    def t_func(self, args):
        f = args[0]
        r2_ns = args[1]
        print('afbj @ {}'.format(f['name']))
        try:
            bb = r2_ns.cmdj('afbj @ {}'.format(f['name']))
            if bb:
                return bb[0]['addr']
            else:
                return None
        except Exception as e:
            print(e)
            return None

    def thread(self):
        funcs = self.r2.cmdj('aflj')
        mgr = ThreadMgr()
        ns = mgr.Namespace()
        ns.r2 = self.r2
        pool = ThreadPool(2)
        results = pool.map(self.t_func, product(funcs, [ns.r2]))
        pool.close()
        pool.join()
        print(list(results))

This is the class I am using. I make a call to the Test.thread function in my main function.

I expect the application to print each command it is about to run through r2pipe (afbj @ entry0, and so on), and then to print the list of results containing the first basic block address of each function, e.g. [40000, 50000, ...].

The application does print out the command about to run, but then hangs before printing out the results.

1 Answer

ENVIRONMENT

  • radare2: radare2 4.2.0-git 23712 @ linux-x86-64 git.4.1.1-97-g5a48a4017 commit: 5a48a401787c0eab31ecfb48bebf7cdfccb66e9b build: 2020-01-09__21:44:51
  • r2pipe: 1.4.2
  • python: Python 3.6.9 (default, Nov 7 2019, 10:44:02)
  • system: Ubuntu 18.04.3 LTS

SOLUTION

  • This may be due to passing the same instance of r2pipe.open() to every call of t_func in the pool. One solution is to move the following lines of code into t_func:
r2 = r2pipe.open('filename')
r2.cmd('aaa')
  • This works; however, it's terribly slow to reanalyze the binary for each thread/process.
  • Also, it is often faster to allow radare2 to do as much of the work as possible and limit the number of commands we need to send using r2pipe.
    • This problem is solved by using the command: afbj @@f
    • afbj # List basic blocks of given function and show results in json
    • @@f # Execute the command for each function
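If the threaded design needs to stay, one way to soften the reanalysis cost mentioned above is to open one pipe per worker thread instead of one per call. The following is a minimal sketch under that assumption, not something from the original answer: the names `first_block_addresses`, `_init_worker`, `_first_block`, and `open_pipe` are hypothetical, and `open_pipe` stands for any zero-argument callable that returns an analyzed r2pipe-like object with a `cmdj` method.

```python
import threading
from multiprocessing.dummy import Pool as ThreadPool

# One attribute namespace per thread, so each worker keeps its own pipe.
_worker = threading.local()


def _init_worker(open_pipe):
    # Runs once in each worker thread when the pool starts.
    _worker.r2 = open_pipe()


def _first_block(name):
    # Uses only the pipe owned by the current worker thread.
    blocks = _worker.r2.cmdj('afbj @ {}'.format(name))
    return blocks[0]['addr'] if blocks else None


def first_block_addresses(open_pipe, func_names, workers=2):
    """Map function names to their first basic-block address (or None)."""
    with ThreadPool(workers, initializer=_init_worker,
                    initargs=(open_pipe,)) as pool:
        return pool.map(_first_block, func_names)


# Hypothetical usage with r2pipe (not executed here):
#   def open_and_analyze():
#       r2 = r2pipe.open('/bin/ls')
#       r2.cmd('aaa')
#       return r2
#   first_block_addresses(open_and_analyze, ['entry0', 'main'])
```

The analysis still runs once per worker, so this only helps when there are many more functions than workers. The sketch also leaves the pipes open; real code would need to close each one when the pool is done.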

EXAMPLE

Longer Example

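import json

import r2pipe

R2 = r2pipe.open('/bin/ls')
R2.cmd("aaaa")  # analyze the binary once, up front
# 'afbj @@f' prints one JSON array of basic blocks per function, one per line
FUNCS: list = R2.cmd('afbj @@f').split("\n")[:-1]
RESULTS: list = []

for func in FUNCS:
    # The output is JSON, so parse it with json.loads;
    # eval fails on JSON literals such as true, false, and null.
    basic_block_info: list = json.loads(func)
    first_block: dict = basic_block_info[0]
    address_first_block: int = first_block['addr']
    RESULTS.append(hex(address_first_block))

print(RESULTS)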

'''
['0x4a56', '0x1636c', '0x3758', '0x15690', '0x15420', '0x154f0', '0x15420',
 '0x154f0', '0x3780', '0x3790', '0x37a0', '0x37b0', '0x37c0', '0x37d0', '0x0',
 ...,
'0x3e90', '0x6210', '0x62f0', '0x8f60', '0x99e0', '0xa860', '0xc640', '0x3e70',
 '0xd200', '0xd220', '0x133a0', '0x14480', '0x144e0', '0x145e0', '0x14840', '0x15cf0']
'''
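The parsing step of the example above can also be pulled into a small function that is testable without radare2. This is only a sketch: `parse_first_blocks` is a hypothetical name, the sample string is made-up data in the shape the example assumes (one JSON array per line), and unlike the example it skips blank lines and functions that report no basic blocks.

```python
import json


def parse_first_blocks(raw_output):
    """Turn the text printed by 'afbj @@f' into first-block addresses.

    Assumes one JSON array of basic blocks per line.
    """
    addresses = []
    for line in raw_output.splitlines():
        line = line.strip()
        if not line:
            continue  # skip blank lines, including a trailing one
        blocks = json.loads(line)
        if blocks:  # a function may report no basic blocks
            addresses.append(hex(blocks[0]['addr']))
    return addresses


# Made-up sample in the assumed shape, not real radare2 output.
sample = (
    '[{"addr": 19030, "size": 12}]\n'
    '[]\n'
    '[{"addr": 14168, "size": 4}, {"addr": 14172, "size": 8}]\n'
)
print(parse_first_blocks(sample))  # ['0x4a56', '0x3758']
```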

Shorter Example

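import json

import r2pipe

R2 = r2pipe.open('/bin/ls')
R2.cmd("aaaa")
# json.loads replaces eval, which fails on JSON literals such as true, false, and null
print([hex(json.loads(func)[0]['addr']) for func in R2.cmd('afbj @@f').split("\n")[:-1]])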
Kuma
  • Also consider checking out https://reverseengineering.stackexchange.com/ for reverse engineering questions! – Kuma Jan 17 '20 at 14:22