10

Problem

I have a folder structure like this:

- modules
    - root
        - abc
            hello.py
            __init__.py
        - xyz
            hi.py
            __init__.py
        blah.py
        __init__.py
    foo.py
    bar.py
    __init__.py

Here is the same thing in string format:

"modules",
"modules/__init__.py",
"modules/foo.py",
"modules/bar.py",
"modules/root",
"modules/root/__init__.py",
"modules/root/blah.py",
"modules/root/abc",
"modules/root/abc/__init__.py",
"modules/root/abc/hello.py",
"modules/root/xyz",
"modules/root/xyz/__init__.py",
"modules/root/xyz/hi.py"

I am trying to print all the modules in Python's dotted import format. Example output would look like this:

modules.foo
modules.bar
modules.root.blah
modules.root.abc.hello
modules.root.xyz.hi

How can I do this easily in Python (if possible, without third-party libraries)?

What I tried

Sample Code

import pkgutil

import modules

absolute_modules = []


def find_modules(module_path):
    for package in pkgutil.walk_packages(module_path):
        print(package)
        if package.ispkg:
            find_modules([package.name])
        else:
            absolute_modules.append(package.name)


if __name__ == "__main__":
    find_modules(modules.__path__)
    for module in absolute_modules:
        print(module)

However, this code only prints 'foo' and 'bar', not 'root' and its sub-packages. I'm also having trouble figuring out how to preserve the absolute import path: the current code only gets the bare package/module name, not the full dotted name.

Vivek Joshy
  • Why do you ask "without any third party libraries"? You are reinventing the wheel (pardon the pun), this is already implemented by `pkg_resources` (a part of the `setuptools` distribution). – wim Feb 20 '18 at 06:48
  • Well, I want to learn how to do this so I can customize it – Vivek Joshy Feb 20 '18 at 06:49
  • OK, but I'm still not seeing why that rules out third party libs. – wim Feb 20 '18 at 06:54
  • Ummm, well the reason is because someone on IRC suggested using the gather library which introduces a @decorator into all the submodules that want to be collected. This is a terrible way to collect module names. As long as the module is actually in the stdlib, it should be fine. Should also be fine if the code is an actively maintained third party lib which in most cases it is not. – Vivek Joshy Feb 20 '18 at 06:57

4 Answers

11

This uses setuptools.find_packages (for the packages) and pkgutil.iter_modules for their submodules. Python2 is supported as well. No need for recursion, it's all handled by these two functions used together.

import os
from pkgutil import iter_modules

from setuptools import find_packages


def find_modules(path):
    modules = set()
    for pkg in find_packages(path):
        modules.add(pkg)
        # Build the package directory in a platform-independent way.
        pkgpath = os.path.join(path, *pkg.split('.'))
        # iter_modules yields (finder, name, ispkg) tuples on every Python
        # version (named tuples since 3.6), so no version check is needed.
        for _, name, ispkg in iter_modules([pkgpath]):
            if not ispkg:
                modules.add(pkg + '.' + name)
    return modules
Flimm
rwst
  • I don't have time to verify this works. So I've unmarked my answer as the correct version. However please note: `len(find_abs_modules(xml)) == len(list(find_modules(xml.__path__[0])))` returns False and also shows `_private` modules. – Vivek Joshy Jan 23 '19 at 08:55
  • iter_modules doesn't act as documented, at least for me, and iter_modules doesn't work recursively – ninjaconcombre Jul 26 '20 at 15:37
2

So I finally figured out how to do this cleanly and get pkgutil to take care of all the edge cases for you. This code is based on Python's help() function, which only displays top-level modules and packages.

import importlib
import pkgutil

import sys

import modules


def find_abs_modules(module):
    path_list = []
    spec_list = []
    for importer, modname, ispkg in pkgutil.walk_packages(module.__path__):
        import_path = f"{module.__name__}.{modname}"
        if ispkg:
            spec = pkgutil._get_spec(importer, modname)
            importlib._bootstrap._load(spec)
            spec_list.append(spec)
        else:
            path_list.append(import_path)
    for spec in spec_list:
        del sys.modules[spec.name]
    return path_list


if __name__ == "__main__":
    print(sys.modules)
    print(find_abs_modules(modules))
    print(sys.modules)

This will work even for builtin packages.
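
For comparison, here is a minimal sketch of the same idea that stays on public APIs only. It assumes Python 3.6+ (for the `info.name` / `info.ispkg` attributes) and relies on the documented `prefix` argument of `pkgutil.walk_packages`. The function name is reused from above purely for illustration, and `xml` is just a convenient standard-library package to try it on. Note that `walk_packages` imports every sub-package it descends into, so this is not free of side effects either.

```python
import importlib
import pkgutil


def find_abs_modules(package):
    """Return the dotted names of all non-package modules below `package`."""
    names = []
    # The prefix makes walk_packages yield fully qualified names, which it
    # also needs in order to import sub-packages and recurse into them.
    prefix = package.__name__ + "."
    for info in pkgutil.walk_packages(package.__path__, prefix):
        if not info.ispkg:
            names.append(info.name)
    return names


if __name__ == "__main__":
    # `xml` is only an example package; substitute your own, e.g. `modules`.
    xml = importlib.import_module("xml")
    for name in find_abs_modules(xml):
        print(name)
```

Because the names come back already qualified, there is no need to load specs by hand or to clean up `sys.modules` afterwards.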

Vivek Joshy
  • Hmm ... it did not scan recursively in my use case, but it helped me come up with my own solution: I'm using `pkg=importlib.import_module(import_path)` and then recursively calling `find_abs_modules(pkg)`. – TheDiveO Dec 14 '18 at 09:30
  • @TheDiveO Can you tell me what the case was it didn't work for? I'm currently using this in a dev env and would like to patch any edge cases. – Vivek Joshy Dec 14 '18 at 09:36
  • Inside the application (package) `foobar` I start scanning from package `foobar.plugins` (which was imported before scanning) and the scan function did not find the `foobar.plugins.footest` subpackage (which I did not import yet). `footest` has an `__init__.py`, but it was never found, even after importing it inside `foobar.plugins.__init__`. So I resorted to relying solely on `importlib.import_module()` because I need to import the plugins found anyway. – TheDiveO Dec 14 '18 at 09:46
  • It's on Python 3.5.3 (Deb 9) – TheDiveO Dec 14 '18 at 09:47
  • @TheDiveO Thanks for the details. I wrote this code for 3.6.5 upwards, which may explain why it's not working in your case (calling private methods is a bad idea in general). I'll be updating this soon with public versions of the same functions so everyone can benefit. Thanks for your help! – Vivek Joshy Dec 14 '18 at 09:52
1
import my_module

from inspect import getmembers

print(getmembers(my_module))

This will list all the members in your module, including submodules, classes, functions, etc. You can then filter the list accordingly.
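
As a rough sketch of that filtering step, `inspect.getmembers` accepts a predicate such as `inspect.ismodule`. One caveat: a submodule only appears as a member of its package after something has imported it, so this lists already-imported submodules rather than everything on disk. The helper name `imported_submodules` is made up for this example, and `xml` is used only because it ships with Python.

```python
import inspect
import xml
import xml.dom  # a submodule only shows up as a member once it is imported


def imported_submodules(package):
    """Return dotted names of already-imported modules that are attributes of `package`."""
    prefix = package.__name__ + "."
    return sorted(
        member.__name__
        for _, member in inspect.getmembers(package, inspect.ismodule)
        # Skip unrelated modules the package merely imported (e.g. `sys`).
        if member.__name__.startswith(prefix)
    )


print(imported_submodules(xml))
```

The `startswith` check is a design choice to keep only modules that actually live under the package, since `getmembers` also returns any other module object bound in the package's namespace.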

Ibolit
-1

The code below prints the dotted path of every module in the packages found under the current working directory.

import os

base = os.getcwd()
for root, dirnames, filenames in os.walk(base):
    # Skip the starting directory and any directory that is not a regular package.
    if root == base or not os.path.isfile(os.path.join(root, "__init__.py")):
        continue
    # Turn the path relative to the working directory into a dotted prefix.
    package = os.path.relpath(root, base).replace(os.sep, ".")
    for filename in filenames:
        name, ext = os.path.splitext(filename)
        if ext == ".py" and name != "__init__":
            print(package + "." + name)

This code will display:

modules.foo
modules.bar
modules.root.blah
modules.root.abc.hello
modules.root.xyz.hi

This assumes you run it from the directory that contains the modules folder.

peterh
  • Note that you cannot rely on the existence of `__init__.py` files to solve this, especially not in newer versions of Python where a single package can span multiple different folders on a file system. For example see the docs on [namespace packages](https://packaging.python.org/guides/packaging-namespace-packages/) in Python. – ely Mar 09 '21 at 14:39
  • Not to mention that you can't rely on the code being expanded on a filesystem (or on a filesystem at all, technically). It could be a zipped package (https://docs.python.org/3/library/zipimport.html) or anything else via some custom importlib hooks. – mikenerone Mar 19 '23 at 17:08