
I am working on a Python library that turns Python source code into a control flow graph (CFG). As an intermediate step, I have converted the source code into an XML representation.

For example, the following source code:

def bar(): # line 32
    a += 1 # line 33
    for x in [1,2]: # line 34
        print x # line 35
        if x%2: # line 36
            print x**2 # line 37
            break # line 38
        else: # line 39
            print x**3 # line 40
    else: # line 41
        print "wpp" # line 42
    print "done" # line 43

is converted into the following XML:

<?xml version="1.0" ?>
<module>
:   0
:   <functiondef>
:   :   32
:   :   <augassign>
:   :   :   33
:   :   :   <name>33</name>
:   :   :   <add>add</add>
:   :   :   <num>33</num>
:   :   </augassign>
:   :   <for>
:   :   :   34
:   :   :   <print>
:   :   :   :   35
:   :   :   :   <name>35</name>
:   :   :   </print>
:   :   :   <if>
:   :   :   :   36
:   :   :   :   <binop>
:   :   :   :   :   36
:   :   :   :   :   <name>36</name>
:   :   :   :   :   <mod>mod</mod>
:   :   :   :   :   <num>36</num>
:   :   :   :   </binop>
:   :   :   :   <print>
:   :   :   :   :   37
:   :   :   :   :   <binop>
:   :   :   :   :   :   37
:   :   :   :   :   :   <name>37</name>
:   :   :   :   :   :   <pow>pow</pow>
:   :   :   :   :   :   <num>37</num>
:   :   :   :   :   </binop>
:   :   :   :   </print>
:   :   :   :   <break>38</break>
:   :   :   :   <else>
:   :   :   :   :   39
:   :   :   :   :   <print>
:   :   :   :   :   :   40
:   :   :   :   :   :   <binop>
:   :   :   :   :   :   :   40
:   :   :   :   :   :   :   <name>40</name>
:   :   :   :   :   :   :   <pow>pow</pow>
:   :   :   :   :   :   :   <num>40</num>
:   :   :   :   :   :   </binop>
:   :   :   :   :   </print>
:   :   :   :   </else>
:   :   :   </if>
:   :   :   <else>
:   :   :   :   41
:   :   :   :   <print>
:   :   :   :   :   42
:   :   :   :   :   <str>42</str>
:   :   :   :   </print>
:   :   :   </else>
:   :   </for>
:   :   <print>
:   :   :   43
:   :   :   <str>43</str>
:   :   </print>
:   </functiondef>
</module>

What I need to do now is turn this XML into a CFG, represented as an adjacency list in which each line number maps to the line numbers that can execute next:

0 [32]
32 [33]
33 [34]
34 [35, 42]
35 [36]
36 [37, 40]
37 [38]
38 [43]
40 [34]
42 [43]
43 []

I've handled most of the cases I've seen so far. However, I'm having trouble connecting the correct ends of the for loop body back to the top of the loop (its definition on line 34).

I'd be very grateful to anyone who can help me work out which nodes of the for loop connect back to the top of the loop. In this example there is only one such node: the one on line 40.

inspectorG4dget

1 Answer


You can use the pycfg package from PyPI to generate the graph as text, and a Python program called cfg.py (included below) to render it as a graphical diagram.

As an example, I converted your code above to:

for x in [1,2]: 
    print(x) 
    if x%2: 
        print(x**2) 
        break 
    else: 
        print(x**3) 
else: 
    print("wpp") 
    print("done") 

And got this result:

strict digraph "" {
    node [label="\N"];
    0    [label="0: start"];
    1    [label="1: for: (True if [1, 2] else False)"];
    0 -> 1;
    2    [label="1: x = [1, 2].shift()"];
    1 -> 2;
    8    [label="0: stop"];
    1 -> 8;
    7    [label="7: print((x ** 3))"];
    7 -> 1;
    3    [label="2: print(x)"];
    2 -> 3;
    4    [label="3: if: (x % 2)"];
    3 -> 4;
    4 -> 7;
    5    [label="4: print((x ** 2))"];
    4 -> 5;
    6    [label="5: break"];
    5 -> 6;
    6 -> 8;
}

CFG For The Inspector's code
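
To read the answer to your question off this output, you can turn the edge lines into the same adjacency-list format you use and then look for edges that lead back into the loop header (node 1 here). The following is only a minimal sketch: the edge list is copied by hand from the DOT text above, the helper names are my own, and "back edge" is taken to mean "an edge into the header from a node that the header can itself reach", which is a simplification that is enough for this graph.

```python
import re

# Edge lines copied from the DOT output above (labels omitted).
DOT_EDGES = """
0 -> 1;  1 -> 2;  1 -> 8;  7 -> 1;  2 -> 3;
3 -> 4;  4 -> 7;  4 -> 5;  5 -> 6;  6 -> 8;
"""


def adjacency(dot_text):
    """Return {node: sorted successors} for every 'a -> b' edge found."""
    graph = {}
    for src, dst in re.findall(r"(\d+)\s*->\s*(\d+)", dot_text):
        graph.setdefault(int(src), []).append(int(dst))
        graph.setdefault(int(dst), [])  # keep sink nodes such as 'stop'
    return {node: sorted(succ) for node, succ in sorted(graph.items())}


def reachable(graph, start):
    """Return every node that can be reached from `start` via >= 1 edge."""
    seen, stack = set(), [start]
    while stack:
        node = stack.pop()
        for nxt in graph[node]:
            if nxt not in seen:
                seen.add(nxt)
                stack.append(nxt)
    return seen


graph = adjacency(DOT_EDGES)
for node, successors in graph.items():
    print(node, successors)

loop_header = 1
inside = reachable(graph, loop_header)
back_edges = [n for n, succ in graph.items()
              if loop_header in succ and n in inside]
print("nodes jumping back to the loop header:", back_edges)
```

For this graph the only such node is 7, the print((x ** 3)) statement, which corresponds to line 40 in your numbering.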

There is an article on how to do this here:

https://www.geeksforgeeks.org/draw-control-flow-graph-using-pycfg-python/

Let me know if you have any trouble running it. Here is my "go" script for reference:

python ./cfg.py ./target.py

cfg.py is available as part of the GeeksforGeeks article. Here it is in case the link eventually goes dead:

from pycfg.pycfg import PyCFG, CFGNode, slurp
import argparse
import tkinter as tk
from PIL import ImageTk, Image
  
if __name__ == '__main__':
    parser = argparse.ArgumentParser()
  
    parser.add_argument('pythonfile', help ='The python file to be analyzed')
    args = parser.parse_args()
    arcs = []
  
    cfg = PyCFG()
    cfg.gen_cfg(slurp(args.pythonfile).strip())
    g = CFGNode.to_graph(arcs)
    g.draw(args.pythonfile + '.png', prog ='dot')
    print(g.string())
    
  
    # Draw using tkinter.
    root = tk.Tk()
    root.title("Control Flow Graph")
    img1 = Image.open(str(args.pythonfile) + ".png")  # PIL solution
    # Image.ANTIALIAS was removed in Pillow 10; LANCZOS is the same filter.
    img1 = img1.resize((800, 600), Image.LANCZOS)
    img = ImageTk.PhotoImage(img1)
      
    background ="gray"
  
    panel = tk.Label(root, height = 600, image = img)
    panel.pack(side = "top", fill ="both", expand = "yes")
    nodes = g.number_of_nodes()     # no. of nodes.
    edges = g.number_of_edges()     # no. of Edges.
    complexity = edges - nodes + 2         # Cyclomatic complexity
  
    frame = tk.Frame(root, bg = background)
    frame.pack(side ="bottom", fill ="both", expand = "yes")
          
    tk.Label(frame, text ="Nodes\t\t"+str(nodes), bg = background).pack()
    tk.Label(frame, text ="Edges\t\t"+str(edges), bg = background).pack()
    tk.Label(frame, text ="Cyclo Complexity\t"+
             str(complexity), bg = background).pack()
  
    
    root.mainloop()

The file target.py contains the version of your code at the top of this response. Getting it to work with Cygwin was a little cumbersome. One of the dependencies is GraphViz (I think), and the installation did not properly set some of the paths. If anyone has any difficulties, message me, and I will walk you through it.
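
If you would rather keep your own pipeline than depend on pycfg, the rule for the loop's back edges can be stated directly: a statement jumps back to the loop header if control can fall off the end of the loop body from it, or if it is a continue. Below is a rough sketch of that rule using the standard ast module on a Python 3 version of your function, not your XML tree. The helper names (ends, stmt_ends) are hypothetical, only if, for, while, break, continue, return and raise are treated specially (try, with and match are handled as simple statements, which is wrong in general), and ast.increment_lineno is used only so that the line numbers match the ones in your question.

```python
import ast

SOURCE = """\
def bar():
    a += 1
    for x in [1, 2]:
        print(x)
        if x % 2:
            print(x ** 2)
            break
        else:
            print(x ** 3)
    else:
        print("wpp")
    print("done")
"""


def ends(block):
    """Return (fall, brk, cont): line numbers where control leaves `block`
    by falling off its end, by `break`, or by `continue`."""
    fall, brk, cont = set(), set(), set()
    for stmt in block:
        fall, b, c = stmt_ends(stmt)  # only the latest fall-through counts
        brk |= b
        cont |= c
        if not fall:  # nothing falls through, so the rest is unreachable
            break
    return fall, brk, cont


def stmt_ends(stmt):
    """Same triple as ends(), for a single statement."""
    if isinstance(stmt, ast.Break):
        return set(), {stmt.lineno}, set()
    if isinstance(stmt, ast.Continue):
        return set(), set(), {stmt.lineno}
    if isinstance(stmt, (ast.Return, ast.Raise)):
        return set(), set(), set()
    if isinstance(stmt, ast.If):
        f1, b1, c1 = ends(stmt.body)
        if stmt.orelse:
            f2, b2, c2 = ends(stmt.orelse)
        else:  # a false test falls through from the `if` line itself
            f2, b2, c2 = {stmt.lineno}, set(), set()
        return f1 | f2, b1 | b2, c1 | c2
    if isinstance(stmt, (ast.For, ast.While)):
        # A nested loop consumes its own breaks and continues.
        _, inner_breaks, _ = ends(stmt.body)
        normal_exit = ends(stmt.orelse)[0] if stmt.orelse else {stmt.lineno}
        return normal_exit | inner_breaks, set(), set()
    return {stmt.lineno}, set(), set()  # treated as a simple statement


tree = ast.parse(SOURCE)
ast.increment_lineno(tree, 31)  # make `def bar()` line 32, as in the question
loop = next(n for n in ast.walk(tree) if isinstance(n, ast.For))

fall, brk, cont = ends(loop.body)
back_to_header = sorted(fall | cont)
leaves_loop = sorted(brk)
print("loop header:", loop.lineno)
print("connect back to header:", back_to_header)
print("break out of loop:", leaves_loop)
```

On your example this reports line 40 as the only statement that connects back to line 34, and line 38 as the break that skips the loop's else clause. The same bottom-up pass should carry over to your XML tree, since it only needs each node's type, line number and children.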

user3761340