I'm working on a code analyzer and I'm trying to identify all the class types referenced within a function or class in Python.

For example, say I have this class:

import collections

Bar = collections.namedtuple('Bar', ['bar'])
Baz = collections.namedtuple('Baz', ['baz'])

class Foo(object):
    def get_bar(self):
        return Bar(1)
    def get_baz(self):
        return Baz(2)

I'm looking for a way to get the types referenced by a function or a class. Something like this:

print(get_types(Foo.get_bar)) # ['Bar']
print(get_types(Foo.get_baz)) # ['Baz']
print(get_types(Foo)) # ['Bar','Baz']
Nick Gotch
  • interesting, I suspect there lies a method within [`inspect`](https://docs.python.org/3.8/library/inspect.html) that can help achieve this – Chase Jul 21 '20 at 13:55
  • [This answer](https://stackoverflow.com/a/33398553/8873143) looks like it would have some potential in helping to solve your problem. – funie200 Jul 21 '20 at 14:01
  • Are you looking for type annotations perhaps? Note that whether ``return Bar(1)`` returns a ``Bar``, some other type, a random type, or blows up with an exception cannot be known from just that source code. What assumptions are you willing to make about your code? – MisterMiyagi Jul 21 '20 at 14:25
  • @MisterMiyagi Type annotations would work but will require us to add them back through a lot of older code that doesn't have them. That is a potential solution route. What I'm really looking for is all the ORM types (we're using Peewee) in each function so we can determine which functions touch which DB tables. Execution of the function isn't important for this, just if the source includes the type in any branch of its logic. – Nick Gotch Jul 21 '20 at 15:52

1 Answer

One solution is to use type annotations. With the return type of get_bar() annotated as Bar and the return type of get_baz() annotated as Baz, you could write get_types() as below:

import inspect
import collections

Bar = collections.namedtuple('Bar', ['bar'])
Baz = collections.namedtuple('Baz', ['baz'])


class Foo(object):
    def get_bar(self) -> Bar:
        return Bar(1)
    def get_baz(self) -> Baz:
        return Baz(2)


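def get_types(obj):
    if inspect.isclass(obj):
        # Collect the public callables defined on the class.
        methods = [method for method in dir(obj)
                   if callable(getattr(obj, method)) and not method.startswith('_')]
        # Flatten the per-method results into a single list of names.
        return [name for method in methods
                for name in get_types(getattr(obj, method))]

    if callable(obj):
        # Callables without a return annotation contribute nothing.
        annotation = getattr(obj, '__annotations__', {}).get('return')
        return [annotation.__name__] if annotation is not None else []

    return []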


def main():
    print(get_types(Foo.get_bar)) # ['Bar']
    print(get_types(Foo.get_baz)) # ['Baz']
    print(get_types(Foo)) # ['Bar','Baz']


if __name__ == '__main__':
    main()

In get_types(obj), if obj is a class, we select the non-private methods of that class and collect the result of get_types() on each of them. If obj is a function, we return the name of the type in that function's return annotation.
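
Reading `__annotations__` directly can yield plain strings rather than classes, for instance when an annotation is written as a forward reference such as `'Bar'`. A minimal sketch of a variant that resolves those through `typing.get_type_hints` follows. The helper name `get_return_types` and the extra `legacy` method are hypothetical and only there for illustration, and the sketch assumes every annotation can be resolved from the module's globals.

```python
import collections
import inspect
import typing

Bar = collections.namedtuple('Bar', ['bar'])
Baz = collections.namedtuple('Baz', ['baz'])


class Foo(object):
    def get_bar(self) -> 'Bar':  # string (forward-reference) annotation
        return Bar(1)

    def get_baz(self) -> Baz:
        return Baz(2)

    def legacy(self):  # no annotation at all
        return None


def get_return_types(obj):
    """Hypothetical helper: names of the annotated return types of a function or class."""
    if inspect.isclass(obj):
        names = []
        # getmembers returns (name, function) pairs sorted by name.
        for name, member in inspect.getmembers(obj, inspect.isfunction):
            if not name.startswith('_'):
                names.extend(get_return_types(member))
        return names
    # get_type_hints evaluates string annotations against the function's globals.
    return_type = typing.get_type_hints(obj).get('return')
    # Unannotated functions (and non-class hints) contribute nothing.
    return [return_type.__name__] if inspect.isclass(return_type) else []


print(get_return_types(Foo.get_bar))  # ['Bar']
print(get_return_types(Foo.legacy))   # []
print(get_return_types(Foo))          # ['Bar', 'Baz']
```

Only return annotations are inspected here, so a type that is used inside a function body but never returned would still be missed.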

houtanb
  • Some of the code being analyzed has type annotations we use with mypy but there's a good deal of legacy code which doesn't. This seems promising but it will require us to go back through all the code and fill out all the annotations. – Nick Gotch Jul 21 '20 at 15:46
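
For legacy code without annotations, one possible route that needs neither annotations nor execution of the function is to look at the names its compiled code refers to. The sketch below is an assumption-laden illustration, not a Peewee-specific or complete solution: `referenced_classes` is a hypothetical helper, it assumes the types are referenced by a global name in the function's own module, and it only scans the function's top-level code object, so names used inside nested functions or lambdas are not picked up. Because `co_names` also lists attribute names, an attribute that happens to share its name with a global class would show up as a false positive.

```python
import collections
import inspect

Bar = collections.namedtuple('Bar', ['bar'])
Baz = collections.namedtuple('Baz', ['baz'])


class Foo(object):
    def get_bar(self):
        return Bar(1)

    def get_baz(self):
        return Baz(2)


def referenced_classes(obj):
    """Hypothetical helper: sorted names of classes that obj's code refers to by global name."""
    if inspect.isclass(obj):
        found = set()
        # Only plain functions defined directly on the class are scanned;
        # staticmethod/classmethod wrappers and inherited methods are skipped.
        for member in vars(obj).values():
            if inspect.isfunction(member):
                found.update(referenced_classes(member))
        return sorted(found)
    namespace = obj.__globals__
    # co_names holds the global and attribute names used by the function body.
    return sorted(
        name for name in obj.__code__.co_names
        if inspect.isclass(namespace.get(name))
    )


print(referenced_classes(Foo.get_bar))  # ['Bar']
print(referenced_classes(Foo.get_baz))  # ['Baz']
print(referenced_classes(Foo))          # ['Bar', 'Baz']
```

Since the names come from the compiled code rather than from a run of the function, a class referenced in any branch is reported whether or not that branch would ever execute. To restrict the result to ORM models, the `inspect.isclass` check could be swapped for an `issubclass` test against the project's model base class.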