The ast module can do this fairly easily. Assuming the source is stored in a variable named source (it could be read from a file):
import ast
root = ast.parse(source)
names = sorted({node.id for node in ast.walk(root) if isinstance(node, ast.Name)})
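That gives up ordering in exchange for uniqueness and a friendly display order. If you don't need uniqueness but do want ordering, use a list comprehension or generator expression instead of the set comprehension (bearing in mind that ast.walk is documented as yielding nodes in no specified order, so "ordering" here means walk order, not necessarily source order). The resulting list is: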
['a', 'b', 'c', 'do_something', 'f', 'myList', 'range', 'someMethod', 'something', 'x']
Unlike the other solutions posted so far, this will recurse into classes and functions to get the names used inside them, and doesn't require you to import the module or class to check, nor does it require you to implement recursive processing yourself; any syntactically valid Python code will work.
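If you want unique names in the order they first appear in the source, one possible approach is sketched below. It is only an illustration: the sample source string is made up, and it assumes Python 3.7+ so that dict preserves insertion order. Because ast.walk does not promise any particular order, the ast.Name nodes are sorted by their lineno and col_offset attributes first.

```python
import ast

# Hypothetical sample input; any syntactically valid source string works.
source = """
b = 1
a = b + 2
print(a, b)
"""

root = ast.parse(source)

# ast.walk yields nodes in no specified order, so sort by source position.
name_nodes = sorted(
    (node for node in ast.walk(root) if isinstance(node, ast.Name)),
    key=lambda node: (node.lineno, node.col_offset),
)

# dict.fromkeys keeps only the first occurrence of each name, in order.
ordered_names = list(dict.fromkeys(node.id for node in name_nodes))
print(ordered_names)  # ['b', 'a', 'print']
```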
Oddly, on Python 3 (substituting a valid print function call), you get:
['a', 'b', 'c', 'do_something', 'f', 'myList', 'print', 'range', 'someMethod', 'something']
which adds print (as expected; it's a name now, not a keyword statement), but omits x. You didn't ask for x (the argument received by someMethod), and this doesn't produce it on Python 3. Names in function signatures don't create an ast.Name node there; on Python 3 they are ast.arg nodes instead. You can pull that info out of the ast.FunctionDef node from the arg attribute of each entry in the list node.args.args, but it's probably still not comprehensive; I suspect other definition-related names might be missed, e.g. in class declarations with inheritance. You'd need to poke around with some examples to make sure you're checking everything (assuming you want stuff like x and want to work on Python 3).
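As a rough sketch of picking up parameter names on Python 3: rather than reading node.args.args by hand (which only covers ordinary positional parameters), you can look for ast.arg nodes during the same walk. The sample function below is invented for illustration, and this is Python 3 only, since ast.arg does not exist on Python 2.

```python
import ast

# Hypothetical sample input with several kinds of parameters.
source = """
def someMethod(x, *rest, key=None, **extra):
    pass
"""

root = ast.parse(source)

# On Python 3, each parameter is an ast.arg node; its name is in .arg.
param_names = sorted(
    {node.arg for node in ast.walk(root) if isinstance(node, ast.arg)}
)
print(param_names)  # ['extra', 'key', 'rest', 'x']
```

This also catches lambda parameters, since they use ast.arg nodes as well.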
That said, x would show up just fine if you referenced it: passing it to do_something, or using it in any way besides receiving and discarding it, would make it appear.
You can also make an effort to only handle names that are assigned to, not merely used (to exclude do_something, range), by extending the test to:
names = sorted({node.id for node in ast.walk(root) if isinstance(node, ast.Name) and not isinstance(node.ctx, ast.Load)})
But that will also drop someMethod (in both Py2 and Py3) because the definition itself doesn't produce an ast.Name; only a use of it does. So again, you'd have to delve a little deeper into the node attributes of ast.FunctionDef, ast.ClassDef, etc. (their name attribute, for instance) to get the names that don't show up as ast.Name nodes when walked.
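A minimal sketch of that deeper check, for Python 3 only, might look like the following. The helper name defined_names and the sample source are made up for illustration, and it is deliberately not comprehensive: names bound by import statements, except clauses and similar constructs are not handled here.

```python
import ast

# Hypothetical sample input.
source = """
class Thing(object):
    def someMethod(self, x):
        y = do_something(x)
"""

def defined_names(tree):
    """Collect names that are bound (not merely used) in the tree."""
    names = set()
    for node in ast.walk(tree):
        if isinstance(node, ast.Name) and not isinstance(node.ctx, ast.Load):
            # Assignment targets, loop variables, del targets, etc.
            names.add(node.id)
        elif isinstance(node, (ast.FunctionDef, ast.AsyncFunctionDef, ast.ClassDef)):
            # The defined name is a plain string attribute, not an ast.Name.
            names.add(node.name)
        elif isinstance(node, ast.arg):
            # Function and lambda parameters.
            names.add(node.arg)
    return sorted(names)

result = defined_names(ast.parse(source))
print(result)  # ['Thing', 'self', 'someMethod', 'x', 'y']
```

Here object and do_something are left out because they are only loaded, never bound.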