1

I am using pycparser to parse a C file. I want to get the start and end line of each function definition in the file, but what I actually get is only the start of each definition:

memmgr_init at examples/c_files/memmgr.c:46
get_mem_from_pool at examples/c_files/memmgr.c:55

I wish to get something like:

memmgr_init at examples/c_files/memmgr.c: start :46 end : 52

class FuncDefVisitor(c_ast.NodeVisitor):

    def visit_FuncDef(self, node):
        print('%s at %s' % (node.decl.name, node.decl.coord))
Chris Martin

2 Answers

1

You can't do this with pycparser because it doesn't record the end position of functions when it is parsing.

You can regenerate the function body from the AST:

from pycparser import c_parser, c_ast, parse_file, c_generator

class FuncDefVisitor(c_ast.NodeVisitor):
    def __init__(self, bodies):
        self.bodies = bodies
        self.generator = c_generator.CGenerator()

    def visit_FuncDef(self, node):
        # Regenerate C source text for the whole function definition
        self.bodies.append(self.generator.visit(node))

def show_func_defs(filename):
    ast = parse_file(filename, use_cpp=True,
                 cpp_args=r'-Iutils/fake_libc_include')
    bodies = []
    v = FuncDefVisitor(bodies)
    v.visit(ast)
    for body in bodies:
        print(body)

But this may have slightly different formatting from the original and so cannot be used to work out how many lines later the end of the function is from the beginning.

  • 1
    Thanks a lot, Martin! Should I write my own parsing mechanism? I thought of implementing a stack to detect the matching closing brace "}" that marks the end of a function. Or is there a cleaner and easier way of doing it? Thanks – Saurabh Singh Mar 23 '16 at 05:47
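
The brace-matching idea from the comment above can be sketched in plain Python. This is only a rough illustration, not part of pycparser: find_function_end is a hypothetical helper, and it assumes the start line comes from node.decl.coord, that the text is the original (unpreprocessed) source, and that no preprocessor directive inside the function contains unbalanced braces. It uses a depth counter rather than an explicit stack and skips braces inside comments, string literals, and character literals.

```python
def find_function_end(source: str, start_line: int) -> int:
    """Return the 1-based line of the '}' closing the first '{' found at or
    after start_line, or -1 if no balanced closing brace is found."""
    lines = source.splitlines(keepends=True)
    i = sum(len(l) for l in lines[:start_line - 1])  # offset of start_line
    n = len(source)
    line = start_line
    depth = 0
    while i < n:
        ch = source[i]
        two = source[i:i + 2]
        if two == "//":                      # line comment: jump to its newline
            j = source.find("\n", i)
            if j == -1:
                break
            i = j
            continue
        if two == "/*":                      # block comment: jump past '*/'
            j = source.find("*/", i + 2)
            if j == -1:
                break
            line += source.count("\n", i, j)
            i = j + 2
            continue
        if ch in "\"'":                      # string or character literal
            i += 1
            while i < n and source[i] != ch:
                if source[i] == "\\":
                    i += 1                   # also skip the escaped character
                if i < n and source[i] == "\n":
                    line += 1
                i += 1
            i += 1
            continue
        if ch == "\n":
            line += 1
        elif ch == "{":
            depth += 1
        elif ch == "}" and depth > 0:
            depth -= 1
            if depth == 0:
                return line
        i += 1
    return -1
```

Called with the file contents and the start line reported by pycparser, it returns the line holding the function's closing brace under the assumptions listed above.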
0

I have a quick and dirty solution to your problem. What you need to do is get the closest line from the AST. I don't like modifying libraries unless I have to. I assume you are familiar with parsing and data manipulation. If not, I can add more details. The parser.parse method generates an AST object. gcc_or_cpp_output is the preprocessed code generated by gcc or cpp.

ast = parser.parse(gcc_or_cpp_output,filename)

The AST object has a show method with default arguments. You will need to set showcoord to True for your problem.

ast.show(buf=fb,attrnames=True, nodenames=True, showcoord=True)

        buf:
            Open IO buffer into which the Node is printed.

        offset:
            Initial offset (amount of leading spaces)

        attrnames:
            True if you want to see the attribute names in
            name=value pairs. False to only see the values.

        nodenames:
            True if you want to see the actual node names
            within their parents.

        showcoord:
            Do you want the coordinates of each Node to be
            displayed

You will then need to replace the default buf (sys.stdout) with your own buffer class so you can capture the AST dump. You could also traverse the tree, but I'll save a tree-traversal solution for another day. I wrote a simple fake_buffer below.

class fake_buffer():
    def __init__(self):
        self.buffer =[]
    def write(self,string):
        self.buffer.append(string)
    def get_buffer(self):
        return self.buffer
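
As an aside, the standard library's io.StringIO exposes the same write method, so it could probably stand in for fake_buffer. A minimal sketch: the fragments written here are made-up stand-ins for what ast.show() would emit, and dump_lines is just an illustrative name.

```python
import io

buf = io.StringIO()
# Stand-in writes; ast.show(buf=buf, ...) would call buf.write() the same way.
buf.write("FuncDef: ")
buf.write(" (at example.c:46)")
buf.write("\n")

# getvalue() returns everything written so far as one string.
dump_lines = buf.getvalue().splitlines()
```

Because show() writes each line in several pieces, joining the pieces into whole lines like this makes the later parsing step simpler.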

So all you need to do now is pass your fake buffer to the ast.show() method to capture the AST.

fb = fake_buffer()
ast.show(buf=fb,attrnames=True, nodenames=True, showcoord=True)

You will have your AST as a list of strings at this point. The function definitions will be near the bottom. Now you just need to parse out all the extra stuff and get the max coordinate within each function definition. A dumped node looks like this:

  FuncCall <block_items[12]>:  (at ...blah_path_stuff.../year.c:48)
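
The "parse out the extra stuff" step could look roughly like the sketch below. It assumes the dump was produced with attrnames=True and showcoord=True, that nesting is shown by indentation, and that each coordinate ends in file:line or file:line:column. function_line_ranges and both regexes are hypothetical names, and the sample fragments are invented for illustration.

```python
import re

COORD_RE = re.compile(r"\(at (.+?):(\d+)(?::\d+)?\)\s*$")
NAME_RE = re.compile(r"\bname=(\w+)")

def function_line_ranges(fragments):
    """Map each function name to (lowest, highest) line seen in its subtree."""
    rows = []
    for text in "".join(fragments).splitlines():
        if not text.strip():
            continue
        indent = len(text) - len(text.lstrip(" "))
        m = COORD_RE.search(text)
        rows.append((indent, text.strip(), int(m.group(2)) if m else None))

    ranges = {}
    for i, (indent, node, line) in enumerate(rows):
        if not node.startswith("FuncDef"):
            continue
        name = None
        seen = [] if line is None else [line]
        for sub_indent, sub_node, sub_line in rows[i + 1:]:
            if sub_indent <= indent:          # left this function's subtree
                break
            if name is None and sub_node.startswith(("Decl ", "Decl:")):
                m = NAME_RE.search(sub_node)
                name = m.group(1) if m else None
            if sub_line is not None:
                seen.append(sub_line)
        if name is not None and seen:
            ranges[name] = (min(seen), max(seen))
    return ranges

# Invented sample in the shape of a captured buffer (fragments, not lines).
sample = [
    "FileAST: ", "\n",
    "  FuncDef <ext[0]>: ", " (at memmgr.c:46:6)", "\n",
    "    Decl <decl>: name=memmgr_init, quals=[], storage=[], funcspec=[]",
    " (at memmgr.c:46:6)", "\n",
    "    Compound <body>: ", " (at memmgr.c:47:1)", "\n",
    "      Assignment <block_items[0]>: op==", " (at memmgr.c:51:5)", "\n",
    "  FuncDef <ext[1]>: ", " (at memmgr.c:55)", "\n",
    "    Decl <decl>: name=get_mem_from_pool, quals=[]", " (at memmgr.c:55)", "\n",
    "    Compound <body>: ", " (at memmgr.c:56)", "\n",
    "      Return <block_items[0]>: ", " (at memmgr.c:70)", "\n",
]
print(function_line_ranges(sample))
```

Note that the highest coordinate is the line where the last statement starts, so the closing brace will usually sit one or more lines after it.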


Quentin Mayo