
I recently acquired a trial version of some source code so that I could check its MISRA compliance before purchasing. I ran PC-lint over the C code and got a huge number of violations. I want to tidy up the generated HTML so that I can sort the violations by rule. Googling for an existing tool yielded little, so I began writing a Python script instead...

In short, the script iterates over every line of the HTML output multiple times to check for a particular string. Of course this takes a ridiculously long time to execute. I have been unable to find an elegant solution, but I'm hoping I'm missing something obvious that someone could point out. Otherwise, perhaps another language that executes faster would be more appropriate. Cheers!

#!/usr/bin/env python

import re
rule_search = re.compile("Required Rule (.*?),",re.DOTALL|re.M)
rule_search2 = re.compile("MISRA 2004 Rule (.*?)]",re.DOTALL|re.M)
line_search = re.compile("<br>(.*?)<br>",re.DOTALL|re.M)

data=open('lint-all.html').read()

unique_rules = list(set(rule_search.findall(data)))
unique_rules2 = list(set(rule_search2.findall(data)))

MISRA_Rules = unique_rules + unique_rules2
count = [0] * len(MISRA_Rules)

page_lines = {}     
pages = {}  

counts = open("pages/counts.html",'w')
counts.write("<h2>Violated Rules Count</h2><h3><ol>")
counts.close()
for i in range (len(MISRA_Rules)):
    pages[i] = open("pages/" + str(MISRA_Rules[i]).translate(None, '.') + ".html", 'w')
    pages[i].close()
    counts = open("pages/counts.html",'a+')
    counts.write("<a href=" + str(MISRA_Rules[i]).translate(None, '.') + ".html>" + str(MISRA_Rules[i]) + "</a>: <font size='3'> 0 </font>&nbsp;&nbsp;&nbsp;&nbsp;&nbsp;" )
    if i%4 == 0 and i != 0:
        counts.write("<br />")


counts.write("<br /><a href=sorted.html>Total:</a> " + "<font size='3'>" + str(count) + "</font>")
counts.write("</h3>")

for i in range (len(MISRA_Rules)):
    pages[i] = open("pages/" + str(MISRA_Rules[i]).translate(None, '.') + ".html", 'a+')
    pages[i].write("<h1>MISRA Rule " + str(MISRA_Rules[i]) + "</h1>")
    pages[i].write("""<link rel="import" href="counts.html">""")
    for j in range (len(line_search.findall(data))):
        if "Rule " + str(MISRA_Rules[i]) in line_search.findall(data)[j]:
            count[i] += 1
            pages[i].write("<br>")
            pages[i].write(line_search.findall(data)[j])
            pages[i].write("</br>")

print "out"

new_html = open('pages/sorted.html', 'w')

counts = """<h2>Violated Rules Count</h2><h3><ol>"""        
for i in range (len(MISRA_Rules)):
    counts += """<a href=""" + str(MISRA_Rules[i]).translate(None, '.') + ".html" +  """>""" + str(MISRA_Rules[i]) + """</a>: <font size="3">""" + str(count[i]) + """</font>&nbsp;&nbsp;&nbsp;&nbsp;&nbsp;"""
    if i%4 == 0 and i != 0:
        counts += """<br />"""


counts += """<br /><a href=sorted.html>Total:</a> """ + """<font size="3">""" + str(sum(count)) + """</font>"""
counts += """</h3>"""
# counts is a plain string at this point, so there is nothing to close
new_html.write(counts)

new_html.write(data)
new_html.close()
cafce25
aj1
  • Have you tried BeautifulSoup? http://www.crummy.com/software/BeautifulSoup/bs4/doc/ – Dyno Fu Nov 27 '15 at 04:34
  • Thanks, I will give Beautiful Soup a shot... if I can run it without installing it. Installing stuff is not an option. – aj1 Nov 27 '15 at 04:40
  • Amendment to the code for anyone reading it: the loop over the MISRA rules is now nested inside the loop over the line data, which gives a significant speedup. – aj1 Nov 27 '15 at 04:40
  • If the static analyser is half-decent, it should have at least a few built-in options to generate reports. – Lundin Nov 27 '15 at 07:48

1 Answer


Several approaches are possible.

The first is to optimize the existing code. It's difficult to say at a glance where the time goes, so in this case one reads the cProfile docs and sets up a profiler. Its report will show you the bottlenecks.
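As a rough illustration of that first approach, here is a minimal profiling sketch. It assumes Python 3 (the question's script is Python 2), and `SAMPLE`, `count_matches_slow` and `profile_report` are hypothetical names made up for this example. The slow function only mimics the shape of the question's inner loop; it is not the original script.

```python
import cProfile
import io
import pstats
import re

LINE_RE = re.compile(r"<br>(.*?)<br>", re.DOTALL)

# Hypothetical stand-in for lint-all.html, small enough to run instantly.
SAMPLE = "<br>a.c 10 [MISRA 2004 Rule 10.1]<br>" * 200


def count_matches_slow(data, rule):
    """Mimic the question's inner loop: findall is re-run on every iteration."""
    hits = 0
    for j in range(len(LINE_RE.findall(data))):
        if "Rule " + rule in LINE_RE.findall(data)[j]:
            hits += 1
    return hits


def profile_report(func, *args):
    """Run func under cProfile; return its result and a text report."""
    profiler = cProfile.Profile()
    result = profiler.runcall(func, *args)
    stream = io.StringIO()
    stats = pstats.Stats(profiler, stream=stream)
    # Sort by cumulative time so the most expensive call chains come first.
    stats.sort_stats("cumulative").print_stats(5)
    return result, stream.getvalue()


if __name__ == "__main__":
    hits, report = profile_report(count_matches_slow, SAMPLE, "10.1")
    print(hits)
    print(report)
```

In the report, the call count next to `findall` should stand out: it grows with the number of lines, which suggests computing the list of lines once and reusing it.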

The second approach (preferable, in my opinion) is to parse the data in Python but leave the HTML generation to a specialized tool, such as the jinja2 template engine, which is used extensively in web development. A simpler alternative to jinja2 is mustache, which most likely won't require any installation.
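A minimal sketch of that split between parsing and rendering might look like the following. Since installing packages is not an option for the asker, the standard library's `string.Template` is swapped in here for jinja2 or mustache; it is far less capable, but the separation is the same. The two regular expressions are adapted from the question's script, while `SAMPLE`, `group_by_rule`, `render_page` and the page layout are assumptions made for illustration, and the sample lines are invented rather than real PC-lint output.

```python
import re
from collections import defaultdict
from string import Template

# Patterns adapted from the question's script.
LINE_RE = re.compile(r"<br>(.*?)<br>", re.DOTALL)
RULE_RE = re.compile(r"Required Rule (.*?),|MISRA 2004 Rule (.*?)\]", re.DOTALL)

# Hypothetical page layout; a real template engine would load this from a file.
PAGE = Template("<h1>MISRA Rule $rule</h1>\n<p>$total violation(s)</p>\n$body\n")

# Invented sample input standing in for lint-all.html.
SAMPLE = (
    "<br>a.c 10 Note 960: Violates MISRA 2004 Required Rule 10.1, implicit conversion<br>"
    "<br>a.c 22 Info 718: Symbol undeclared [MISRA 2004 Rule 8.1]<br>"
    "<br>b.c 5 Note 960: Violates MISRA 2004 Required Rule 10.1, implicit conversion<br>"
)


def group_by_rule(data):
    """Scan the report once and map each rule number to its message lines."""
    groups = defaultdict(list)
    for line in LINE_RE.findall(data):
        # Exactly one of the two capture groups is non-empty per match.
        rules = set(first or second for first, second in RULE_RE.findall(line))
        for rule in rules:
            groups[rule].append(line)
    return groups


def render_page(rule, lines):
    """Fill the page template for one rule; no parsing happens here."""
    body = "\n".join("<p>" + line + "</p>" for line in lines)
    return PAGE.substitute(rule=rule, total=len(lines), body=body)


if __name__ == "__main__":
    for rule, lines in sorted(group_by_rule(SAMPLE).items()):
        print(render_page(rule, lines))
```

Each rendered page could then be written to `pages/` under the rule number with the dots removed, as the question's script does. Because the report is scanned only once, the per-rule counts come for free as `len(lines)`.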

The third approach is to do all of this in the browser. Add jQuery for DOM manipulation (to introduce new tags and classes) and a CSS stylesheet (to define how the new tags and classes should look).

u354356007