26

I was recently tasked with maintaining a bunch of code that uses `from module import *` fairly heavily.

This codebase has gotten big enough that import conflicts/naming ambiguity/"where the heck did this function come from, there are like eight imported modules that have one with the same name?!"ism have become more and more common.

Moving forward, I've been using explicit members (i.e. `import module` ... `module.object.function()`) to make the maintenance work I do more readable.

But I was wondering: is there an IDE or utility which robustly parses Python code and refactors * import statements into module import statements, and then prepends the full module path onto all references to members of that module?

We're not using metaprogramming/reflection/`inspect`/monkeypatching heavily, so if the aforementioned IDE/util behaves poorly with such things, that is OK.

karthikr
Zac B
  • You cannot do that in the general case. Observe a `module.py` of `import random;if random.random() > 0.5: x = 1`. – phihag Oct 01 '12 at 16:28
  • 3
    As pointed out by @phihag it's not possible to solve this for the general case. However, [this question](http://stackoverflow.com/questions/3573694/mapping-module-imports-in-python-for-easy-refactoring) should provide you with a couple building blocks to write your own tools and at least automate some of the tasks involved in that refactoring. Another useful one not mentioned there is [rope](http://rope.sourceforge.net/), a Python refactoring library. – Lukas Graf Oct 01 '12 at 17:08
  • 2
    one other thing you could try is just replacing `from module import *` with a list of explicit imports (`from module import x, y, z`) and that way you could tell what you missed when name errors pop up (or even better, you could use something like `pyflakes` to do static code analysis and tell that for you). As soon as you remove `from module import *` statements, it should be able to tell you what isn't defined. – Jeff Tratner Oct 06 '12 at 01:20
  • @Jeff There are a couple of problems with that though: 1) How do you get the list of `x, y, z`? If it's a nice module or package, it defines `__all__`, which makes this very easy. But what if not? Import the module and look at its globals? Meh. That won't work if you want to do a purely static analysis (which one should in my opinion). You'd probably have to build an [AST](http://docs.python.org/library/ast.html) and parse it. – Lukas Graf Oct 07 '12 at 12:21
  • @Jeff 2) What if a `*` import shadows a name that's already in the namespace? Like for example with `from numpy import *`, which will import `numpy.sum` and shadow the `sum()` builtin. If `from numpy import sum` is missing that _won't_ raise a `NameError`, but it will behave differently. – Lukas Graf Oct 07 '12 at 12:24
  • 2
    @LukasGraf valid points. You can actually easily get a list of what a module imports by doing the following: `old_globals = dict(globals()); from module import *; print [k for k,v in dict(globals()).items() if k not in old_globals or old_globals[k] != v]`, though this requires actually loading the module. So that could at least let you convert a line like `from numpy import *` to `from numpy import x,y,z` – Jeff Tratner Oct 07 '12 at 13:47
  • @Jeff You could, but that wouldn't be static analysis any more. So all the application code would need to be importable by the refactoring tool, with all its dependencies installed, Python versions need to match, imports of the code's modules need to be side-effect free, ... For large projects with lots of dependencies, this is at the very least a major hassle or sometimes can even be near impossible to achieve. – Lukas Graf Oct 07 '12 at 14:51
  • @LukasGraf I was thinking of external dependencies, but yeah bummer. – Jeff Tratner Oct 07 '12 at 14:57
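The shadowing concern raised in the comments above (a `*` import silently replacing a builtin such as `sum`) can at least be flagged before any refactoring starts. The following is a minimal sketch, not a static analysis: it imports the module, so the caveats about dependencies and import side effects still apply. The helper names `star_import_names` and `shadowed_builtins` are made up for illustration.

```python
import builtins
import importlib


def star_import_names(module_name):
    """Return the names that `from module_name import *` would bind."""
    module = importlib.import_module(module_name)
    exported = getattr(module, "__all__", None)
    if exported is None:
        # Without __all__, a star import binds every name that does not
        # start with an underscore.
        exported = [name for name in vars(module) if not name.startswith("_")]
    return set(exported)


def shadowed_builtins(module_name):
    """Return names from a star import that would replace a builtin."""
    return sorted(star_import_names(module_name) & set(dir(builtins)))
```

Any name this returns needs an explicit import (or an explicit `module.` prefix) after the `*` import is removed, because leaving it out would not raise a `NameError`.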

3 Answers

5

Not a perfect solution, but what I usually do is this:

  1. Open Pydev
  2. Remove all * imports
  3. Use the optimize imports command (ctrl+shift+o) to re-add all the imports

Roughly solves the problem :)


If you want to build a solution yourself, try http://docs.python.org/library/modulefinder.html
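As a rough sketch of what `modulefinder` offers here: it scans a script's bytecode for import statements without running the script, and reports which modules get pulled in. It does not tell you which names a `*` import contributes, so it would only be one building block. The function name `modules_used_by` is hypothetical.

```python
from modulefinder import ModuleFinder


def modules_used_by(script_path):
    """Return sorted names of the modules script_path imports, without executing it."""
    finder = ModuleFinder()
    finder.run_script(script_path)
    # The script itself is recorded under the name "__main__"; skip it.
    return sorted(name for name in finder.modules if name != "__main__")
```

The result includes transitive imports, so for a real script the list is usually long. Modules that could not be located are collected separately in `finder.badmodules`.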

Wolph
  • 3
    That's not (fully) automated though ;) And it will inevitably fail to recognize the [`numpy.sum` problem outlined in my comment](http://stackoverflow.com/questions/12677061/is-there-an-ide-utility-to-refactor-python-imports-to-use-standad-module-membe#comment17255796_12677061), changing the code's behavior and introducing a subtle, hard-to-find bug. – Lukas Graf Oct 07 '12 at 12:37
  • 1
    Upvoted for a useful utility solution, but as @LukasGraf says, it's not applicable to all cases. More worrying, it can change behavior in said edge cases. A general-case utility, even for just the "static" (i.e. no self-modification, `inspect`ion, or even heavy decoration or reflection) cases is what I'm after. Currently playing with `ast` and the CPython source to see if anything can be easily hacked together. Confidence is low, however. – Zac B Oct 08 '12 at 00:46
  • @ZacB: I've had the same problem in the past and I've never found a good solution and it seemed too complex to me to build it myself (i.e. takes too much time). If you can execute the code it's relatively easy to figure out what import would come from a certain module but it is still a lot of work to get something like that working. – Wolph Oct 08 '12 at 22:48
  • @ZacB: if you are going to build it, use http://docs.python.org/library/modulefinder.html to inspect the imports :) – Wolph Oct 08 '12 at 22:50
  • Still working on a universal-case solution. I'll BitBucket it when I get it to a point at which it (at least) does no harm. – Zac B Jan 18 '13 at 01:44
  • @ZacB: good to hear, although I don't think you can make it harmless without setting constraints. For example, other modules could depend on the imports of your module and things like `globals()[some_variable]` are also impossible to predict. – Wolph Jan 18 '13 at 08:46
3

Here are the other related tools mentioned:

  • working with `ast` directly, which is very low-level for your use,
  • working with `modulefinder`, which may have a lot of the boilerplate code you are looking for,
  • rope, a refactoring library (mentioned by @Lukas Graf),
  • Bicycle Repair Man, a refactoring library,
  • the logilab-astng library used in pylint.
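To give an idea of what the low-level `ast` route looks like, here is a minimal sketch that only locates wildcard imports in a piece of source code. It does not resolve which names they bring in; that is the hard part. The function name `find_star_imports` is made up for this example.

```python
import ast


def find_star_imports(source):
    """Return (line_number, module_name) for each `from X import *` in source."""
    found = []
    for node in ast.walk(ast.parse(source)):
        if isinstance(node, ast.ImportFrom) and any(
            alias.name == "*" for alias in node.names
        ):
            # node.module is None for relative forms such as `from . import *`.
            found.append((node.lineno, node.module))
    return sorted(found, key=lambda item: item[0])
```

For example, `find_star_imports("import os\nfrom re import *\n")` reports the wildcard import of `re` on line 2 and ignores the plain `import os`.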

More about pylint

pylint is a very good tool built on top of `ast` that can already tell you where in your code there are `from somemodule import *` statements, as well as which imports are not necessary.

Example:

    # next is what's on line 32
    from re import *

This will complain:

    W: 32,0: Wildcard import re
    W: 32,0: Unused import finditer from wildcard import
    W: 32,0: Unused import LOCALE from wildcard import
    ... # this is a long list ...

Towards a solution?

Note that the pylint output above includes line numbers. It might take some effort, but a refactoring tool can read those particular warnings, get the line number, import the module, and look at its `__all__` list (or use a sandboxed `execfile()` call to collect the module's global names; would `modulefinder` help with that? maybe...). Build two `set()`s, one from the names in `__all__` and one from the names pylint reports as unused, and take the difference: what remains is the set of names the code actually uses. Then replace the line featuring the wildcard import with an explicit import of those names.
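A minimal sketch of that set difference, under a few assumptions: the module can be imported safely, pylint reports every unused wildcard name, and its message text matches the "Unused import NAME from wildcard import" wording shown above. The function names are hypothetical.

```python
import importlib
import re

# Matches the warning text shown above, e.g.
# "W: 32,0: Unused import finditer from wildcard import"
UNUSED_RE = re.compile(r"Unused import (\w+) from wildcard import")


def unused_names(pylint_output):
    """Collect the names pylint reports as unused wildcard imports."""
    return set(UNUSED_RE.findall(pylint_output))


def explicit_import(module_name, pylint_output):
    """Build a `from module import a, b` line with only the names still in use."""
    module = importlib.import_module(module_name)
    exported = getattr(module, "__all__", None)
    if exported is None:
        # No __all__: a star import binds every name without a leading underscore.
        exported = [name for name in vars(module) if not name.startswith("_")]
    used = sorted(set(exported) - unused_names(pylint_output))
    if not used:
        return None  # nothing from the module is used; the import can simply go
    return "from {} import {}".format(module_name, ", ".join(used))
```

The returned line would replace the wildcard import at the line number pylint reported. This does not address names that shadow builtins or names re-exported to other modules, so the result still needs a review.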

dnozay
0

I wrote some refactoring tools to do just that. Star Namer goes through all of the wildcard `*` imports in a script and replaces them with the actual names to be imported.

Usage: ./star_namer.py module_filename script_filename


Once you've converted all of your star imports to actual names, you can use `from_to_import.py` to turn them into plain module imports with qualified references. This is how it works:

  1. Running your script through pylint and counting up all of the currently undefined words.

  2. Removing all of the from modname import lines from the script.

  3. Running the script through pylint again and comparing the difference in undefined words.

  4. Going through the JSON output of pylint (in reverse order) to determine the exact position of each replacement, and inserting `modname.` in the correct place.
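The reason for working in reverse order in the last step can be shown with a short sketch. This is not the code from the tool itself, only an illustration of the idea; it assumes the positions have already been extracted as (line, column) pairs with 1-based lines and 0-based columns, and the function name `qualify_names` is made up.

```python
def qualify_names(source, positions, modname):
    """Insert `modname.` at each (line, column) position in source."""
    lines = source.splitlines(keepends=True)
    prefix = modname + "."
    # Go from the last position to the first, so that an insertion never
    # shifts the column of a position that has not been handled yet.
    for line_no, col in sorted(positions, reverse=True):
        text = lines[line_no - 1]
        lines[line_no - 1] = text[:col] + prefix + text[col:]
    return "".join(lines)
```

Processing front to back would require adjusting every later column on the same line by the length of the inserted prefix; going backwards avoids that bookkeeping.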

I thought this approach would be a little more robust: it offloads the syntax processing to an advanced utility that's designed for it, instead of trying to grep through all the text myself with regular expressions.

Usage: from_to_import.py script_name modname

It will show you what changes are to be made before making them. Press y to save. The main issues I've found so far are that inserting the `modname.` text can leave comments misaligned, and that it doesn't deal well with aliased function names (`from ... import quickrun as qrun`).

Full documentation here: https://github.com/SurpriseDog/Star-Wrangler

SurpriseDog