
I had a question on how libraries like numpy work. When I import numpy, I'm given access to a host of built in classes, functions, and constants such as numpy.array, numpy.sqrt etc.

But within numpy there are additional submodules such as numpy.testing.

How is this done? In my limited experience, modules with submodules are simply folders with a __init__.py file, while modules with functions/classes are actual python files. How does one create a module "folder" that also has functions/classes?

D_00
ImpGuard

1 Answer


A folder containing .py files and an __init__.py is called a package. Each of those .py files, containing classes and functions, is a module. Nesting folders gives you subpackages.

So for example if I had the following structure:

  mypackage
     __init__.py
     module_a.py
     module_b.py
     mysubpackage
          __init__.py
          module_c.py
          module_d.py

I could import mypackage.module_a or mypackage.mysubpackage.module_c and so on.
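If you want to try the layout above without creating files by hand, here is a minimal sketch that builds it in a temporary directory and imports from it. The `hello` functions are hypothetical placeholders I added so there is something to call; only `module_a` and `module_c` are written out, to keep it short.

```python
import importlib
import sys
import tempfile
from pathlib import Path

# Build the example layout in a throwaway directory.
root = Path(tempfile.mkdtemp())
pkg = root / "mypackage"
sub = pkg / "mysubpackage"
sub.mkdir(parents=True)

# Empty __init__.py files mark the two folders as packages.
(pkg / "__init__.py").write_text("")
(sub / "__init__.py").write_text("")

# Hypothetical module contents (not from the original answer).
(pkg / "module_a.py").write_text(
    "def hello():\n    return 'hello from module_a'\n"
)
(sub / "module_c.py").write_text(
    "def hello():\n    return 'hello from module_c'\n"
)

# Make the throwaway directory importable.
sys.path.insert(0, str(root))
importlib.invalidate_caches()

import mypackage.module_a
import mypackage.mysubpackage.module_c

print(mypackage.module_a.hello())
print(mypackage.mysubpackage.module_c.hello())
```

Both imports resolve purely from the folder structure; nothing in either `__init__.py` is needed for that part.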

You could also add functions to mypackage (like the numpy functions you mentioned) by placing that code in the __init__.py, though this is usually considered ugly.

If you look at numpy's __init__.py you will see a lot of code in there; much of it imports these top-level classes and functions from submodules so that they are available directly on the package. The __init__.py code is the first thing executed when the package is loaded.
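As a rough illustration of both approaches, the sketch below writes an `__init__.py` that defines one function directly and pulls another in from a submodule. The package name `toolkit`, the `_shapes` submodule, and the functions `area` and `describe` are all hypothetical names made up for this example; this is not how numpy itself is laid out.

```python
import importlib
import sys
import tempfile
from pathlib import Path

root = Path(tempfile.mkdtemp())
pkg = root / "toolkit"  # hypothetical package name
pkg.mkdir()

# A submodule holding the actual implementation.
(pkg / "_shapes.py").write_text(
    "def area(width, height):\n"
    "    return width * height\n"
)

# __init__.py runs when the package is first imported. It defines some
# names directly and re-exports another one from the submodule.
(pkg / "__init__.py").write_text(
    "from ._shapes import area\n"
    "\n"
    "VERSION = '0.1'\n"
    "\n"
    "def describe():\n"
    "    return 'toolkit ' + VERSION\n"
)

sys.path.insert(0, str(root))
importlib.invalidate_caches()

import toolkit

print(toolkit.area(3, 4))   # re-exported from toolkit._shapes
print(toolkit.describe())   # defined in __init__.py itself
```

Keeping the implementation in submodules and only importing names in `__init__.py` avoids dumping all the logic into one file while still giving users the short `toolkit.area` spelling.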

chrisinmtown
Mike Vella
  • Then how is it that a library like numpy or scipy can be imported, and contain both classes/functions (like a module) and other modules (like a package). So I can do numpy.array (a class) or numpy.testing.assert... (getting a module). – ImpGuard Sep 01 '13 at 04:27
  • I've just answered that, it's all in the `__init__.py`. Any function in there will be a first-class member of the package when it is loaded by the interpreter. – Mike Vella Sep 01 '13 at 04:34
  • Ah, so that's what I was wondering. I thought it would be considered ugly since a lot of misc. logic would go in it. I presume everything was separately coded and somehow all combined into __init__.py after? It seems quite useful to have something like this but I'm not sure how to replicate it without just dumping lots of code in one file. – ImpGuard Sep 01 '13 at 04:47
  • @ImpGuard In the `__init__.py` you can import whatever files/classes/functions you want. In the `numpy`/`scipy` case they decided to provide the most used functionality there. *However*, this means that when importing `numpy` a lot of its submodules must be imported, which takes significant time. I personally prefer to avoid importing all the stuff in the `__init__.py`. I generally have an `all.py` submodule that does that, so if you want everything you can do `import X.all as X`, otherwise you can choose what to import. – Bakuriu Sep 01 '13 at 05:46
  • that's right, numpy is based on numeric so I suspect it's a legacy thing. – Mike Vella Sep 01 '13 at 05:48
  • Ah, I got it! So if I import something in the __init__.py file, those items become first class members of the package as well! However, just to be really clear, that does mean that if I separated my classes into separate files (to be more organized) and just imported them all when the package is loaded, I still technically have those files as modules that are "hidden"? Meaning if the people who made numpy were organized and put np.matrix in some submodule under numpy and simply imported it when numpy is imported, there might be some hidden numpy.__matrix module that I don't know about. – ImpGuard Sep 01 '13 at 05:50
  • Nothing in Python is ever really "hidden", but yes, you are correct if I understand you. You should make some packages and play around, it's the easiest way to learn this sort of thing! – Mike Vella Sep 01 '13 at 05:53
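
To play around with the points raised in the comments, here is a small sketch of the `all.py` idea together with a check that the submodule behind a re-exported class stays reachable. The package name `lazykit`, the `matrix` submodule, and the `Matrix` class are hypothetical names for illustration only; this is an assumption about how one might lay such a package out, not a description of numpy.

```python
import importlib
import sys
import tempfile
from pathlib import Path

root = Path(tempfile.mkdtemp())
pkg = root / "lazykit"  # hypothetical package name
pkg.mkdir()

# __init__.py is deliberately empty, so importing the package is cheap.
(pkg / "__init__.py").write_text("")

# The class lives in its own submodule.
(pkg / "matrix.py").write_text("class Matrix:\n    pass\n")

# all.py pulls everything together for users who want it all.
(pkg / "all.py").write_text("from .matrix import Matrix\n")

sys.path.insert(0, str(root))
importlib.invalidate_caches()

import lazykit
# Nothing has asked for the submodule yet.
loaded_before = "lazykit.matrix" in sys.modules

import lazykit.all as lk
# all.py imported it, so it is now loaded and registered.
loaded_after = "lazykit.matrix" in sys.modules

# The re-exported class is the same object as the one in the submodule,
# and the submodule itself is reachable as an attribute of the package.
same_object = lk.Matrix is lazykit.matrix.Matrix

print(loaded_before, loaded_after, same_object)
```

So the submodule is not hidden in any real sense: once something imports it, it appears in `sys.modules` and as an attribute of its parent package.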