How to output all unassigned strings in a Python file

Question

I have a Python file (a script) that looks like this:

script.py

"""
Multiline comment with unique
text pertaining to the Foo class
"""
class Foo():
    pass


"""
Multiline comment with unique
text pertaining to the Bar class
"""
class Bar():
    pass


"""
Multiline comment with unique
text pertaining to the FooBar class
"""
class FooBar():
    pass


def print_comments():
    raise NotImplementedError

Is there some way for print_comments to detect and output all the unassigned strings, so that it prints the following?

Multiline comment with unique text pertaining to the Foo class

Multiline comment with unique text pertaining to the Bar class

Multiline comment with unique text pertaining to the FooBar class

score 2 · Accepted Answer · edited Aug 30 '16 at 03:55

Assuming the formatting you indicated in your question, something like this should do it:

import os


class Show_Script():
    def construct(self):
        # read this script's own source
        with open(os.path.abspath(__file__)) as f:
            my_lines = f.readlines()

        comments = []
        in_comment = False

        for line in my_lines:
            # detected the start of a comment
            if line.strip().startswith('"""') and not in_comment:
                in_comment = True
                comments.append('')
            # detected the end of a comment
            elif line.strip().endswith('"""') and in_comment:
                in_comment = False
            # the contents of a comment
            elif in_comment:
                comments[-1] += line

        print('\n'.join(comments))

Thanks! I modified your code a little to open itself and this does the trick. — Charles Clayton, Aug 30 '16 at 03:55

score 1 · Answer 2 · answered Aug 30 '16 at 04:01

Using regex:

$ cat script.py
from __future__ import print_function
import sys, re

"""
Multiline comment with unique
text pertaining to the Foo class
"""
class Foo():
    pass


"""
Multiline comment with unique
text pertaining to the Bar class
"""
class Bar():
    pass


"""
Multiline comment with unique
text pertaining to the FooBar class
"""
class FooBar():
    pass

def print_comments():
    with open(sys.argv[0]) as f:
        file_contents = f.read()

    # loop instead of map(): on Python 3, map() is lazy and would print nothing
    for comment in re.findall(r'"""\n([^"]*)"""', file_contents):
        print(comment)

print_comments()
$ python script.py
Multiline comment with unique
text pertaining to the Foo class

Multiline comment with unique
text pertaining to the Bar class

Multiline comment with unique
text pertaining to the FooBar class

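Regex explanation:

"""\n([^"""]*)"""

The pattern matches an opening """ followed by a newline, captures every following character that is not a double quote, and stops at the closing """. Note that the character class [^"""] is the same as [^"].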
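A quick sketch of how that pattern behaves when a comment itself contains a double quote, compared with a non-greedy alternative. The sample text and the names SAMPLE, CHAR_CLASS and NON_GREEDY are made up for illustration and are not part of the original script:

```python
import re

# Hypothetical sample input: the second block contains a double quote.
SAMPLE = '''"""
plain text
"""
class Foo():
    pass

"""
text with a "quoted" word
"""
class Bar():
    pass
'''

# The pattern from the answer: the capture cannot contain any double quote.
CHAR_CLASS = re.compile(r'"""\n([^"]*)"""')
# Alternative: capture as little as possible up to the next triple quote.
# re.S lets "." match newlines as well.
NON_GREEDY = re.compile(r'"""\n(.*?)"""', re.S)

char_class_matches = CHAR_CLASS.findall(SAMPLE)
non_greedy_matches = NON_GREEDY.findall(SAMPLE)

print(char_class_matches)
print(non_greedy_matches)
```

With this input the character-class pattern finds only the first block, while the non-greedy one finds both. Neither pattern understands Python syntax, so triple-quoted strings that are assigned to a variable would still be picked up.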

The ideal way to do this would be to use the ast module: parse the entire file, then call ast.get_docstring on every node of type ast.FunctionDef, ast.ClassDef or ast.Module and print the result. However, your comments are not docstrings, so this does not work on the file as posted. It would work if the file looked like this instead:

$ cat script.py

import sys, ast

class Foo():
    """
    Multiline comment with unique
    text pertaining to the Foo class
    """
    pass


class Bar():
    """
    Multiline comment with unique
    text pertaining to the Bar class
    """
    pass


class FooBar():
    """
    Multiline comment with unique
    text pertaining to the FooBar class
    """
    pass

def print_docstrings():
    with open(sys.argv[0]) as f:
        file_contents = f.read()

    tree = ast.parse(file_contents)
    # only modules, classes and functions can carry a docstring
    doc_nodes = [node for node in ast.walk(tree)
                 if isinstance(node, (ast.ClassDef, ast.FunctionDef, ast.Module))]
    for node in doc_nodes:
        doc_str = ast.get_docstring(node)
        if doc_str:
            print(doc_str)

print_docstrings()

$ python script.py
Multiline comment with unique
text pertaining to the Foo class
Multiline comment with unique
text pertaining to the Bar class
Multiline comment with unique
text pertaining to the FooBar class
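The ast module can also handle the original layout, where the strings sit above the classes rather than inside them. A string that is never assigned parses as an expression statement (ast.Expr) whose value is a string constant, so one possible sketch is to collect exactly those nodes. This assumes Python 3.8 or later, where string literals parse as ast.Constant; the function name unassigned_strings and the SAMPLE text are hypothetical:

```python
import ast


def unassigned_strings(source):
    """Return every bare string expression in source, in source order."""
    found = []
    for node in ast.walk(ast.parse(source)):
        # An expression statement whose value is a string constant is a
        # string that is never assigned (this includes real docstrings).
        if (isinstance(node, ast.Expr)
                and isinstance(node.value, ast.Constant)
                and isinstance(node.value.value, str)):
            found.append((node.lineno, node.value.value.strip()))
    # ast.walk is breadth-first, so sort by line number to restore file order.
    return [text for _, text in sorted(found)]


# Hypothetical sample mirroring the layout in the question.
SAMPLE = '''
"""
Multiline comment with unique
text pertaining to the Foo class
"""
class Foo():
    pass

"""
Multiline comment with unique
text pertaining to the Bar class
"""
class Bar():
    pass

x = "assigned, so not reported"
'''

result = unassigned_strings(SAMPLE)
print("\n\n".join(result))
```

To run this against the script itself, read the file named by __file__ (or sys.argv[0], as in the examples above) and pass its contents to unassigned_strings. Unlike the regex, this ignores assigned strings and is not confused by quotes inside the text.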
