-4

(When I say "object address", I mean the string that you type in Python to access an object. For example 'life.State.step'. Most of the time, all the objects before the last dot will be packages/modules, but in some cases they can be classes or other objects.)

In my Python project I often have the need to play around with object addresses. Some tasks that I have to do:

  1. Given an object, get its address.
  2. Given an address, get the object, importing any needed modules on the way.
  3. Shorten an object's address by getting rid of redundant intermediate modules. (For example, 'life.life.State.step' may be the official address of an object, but if 'life.State.step' points at the same object, I'd want to use it instead because it's shorter.)
  4. Shorten an object's address by "rooting" a specified module. (For example, 'garlicsim_lib.simpacks.prisoner.prisoner.State.step' may be the official address of an object, but I assume that the user knows where the prisoner package is, so I'd want to use 'prisoner.prisoner.State.step' as the address.)

Is there a module/framework that handles things like that? I wrote a few utility modules to do these things, but if someone has already written a more mature module that does this, I'd prefer to use that.

One note: Please, don't try to show me a quick implementation of these things. It's more complicated than it seems, there are plenty of gotchas, and any quick-n-dirty code will probably fail for many important cases. These kind of tasks call for battle-tested code.

UPDATE: When I say "object", I mostly mean classes, modules, functions, methods, stuff like these. Sorry for not making this clear before.

Matt
  • 22,721
  • 17
  • 71
  • 112
Ram Rachum
  • 84,019
  • 84
  • 236
  • 374
  • 3
    Gotchas? What gotchas? You get a reference, you have a reference. *That* is battle-hardened. Anything involving addresses would be the quick-and-dirty method. – Ignacio Vazquez-Abrams Sep 10 '10 at 11:26
  • Thanks for the comment Ignacio. I disagree with you, as I've witnessed these gotchas myself. – Ram Rachum Sep 10 '10 at 11:31
  • @cool-RR, please, do share your experiences with us. – habnabit Sep 10 '10 at 11:55
  • @Aaron: I wouldn't want to go one-by-one with all the gotchas here, so just a couple of examples: 1. For each object in the chain you must check if it's a module, package or class, and handle it appropriately. 2. Sometimes you're not sure if something is a module, so you want to try to import it, but if there's an ImportError in the module itself you still want to raise it. – Ram Rachum Sep 10 '10 at 12:00
  • You're using the wrong programming language. Use C if you want to play with addresses. – Nick T Sep 10 '10 at 13:45
  • 2
    I'm curious - what exactly is the use case for what you're asking for? Is this for code management during the development process, or something that will apply automagic for users of your library? At first glance it sounds like your project would be better served by more effective package management/organization. Especially considering that your examples are of your own project libraries. – Jeremy Brown Sep 10 '10 at 14:45
  • @Jeremy: A few example usages: 1. Making an object's `repr` string shorter (This is point 3) 2. Referring to a function in a GUI. (This is point 4) 3. Specifying an object without having a real reference to it. (Like in Django's `settings.py` module). – Ram Rachum Sep 10 '10 at 14:53
  • 3
    Perhaps the "meta-reason" you are not getting satisfactory answers is that what you are asking to do here appears to go against many of the precepts of The Zen of Python (`import this`), such as "Explicit is better than implicit", "Special cases aren't special enough to break the rules", "Flat is better than nested", and perhaps most importantly "There should be one-- and preferably only one --obvious way to do it". You're also using terminology (`object addresses`) that is foreign to the Python world (Zen: "If the implementation is hard to explain, it's a bad idea"). But good luck! – Ned Deily Sep 10 '10 at 17:05
  • 4
    What is the problem? I fail to see how the "solution" you want is in any way necessary (let alone a good idea). `repr` strings too long? Subclass, implement new `__repr__()`. You don't want to type (e.g.) `garlicsim_lib.simpacks.prisoner.prisoner.State.step`? Then `import garlicsim_lib.simpacks.prisoner as prisoner`. If you want to dynamically import modules as needed, check if they exist then import them. – Nick T Sep 10 '10 at 19:20

4 Answers4

5

Short answer: No. What you want is impossible.

The long answer is that what you think of as the "address" of an object is anything but. life.State.step is merely one of the ways to get a reference to the object at that particular time. The same "address" at a later point can give you a different object, or it could be an error. What's more, this "address" of yours depends on the context. In life.State.step, the end object depends not just on what life.State and life.State.step are, but what object the name life refers to in that namespace.

Specific answers to your requests:

  1. The end object has no way whatsoever of finding out how you referred to it, and neither has any code that you give the object to. The "address" is not a name, it's not tied to the object, it's merely an arbitrary Python expression that results in an object reference (as all expressions do.) You can only make this work, barely, with specific objects that aren't expected to move around, such as classes and modules. Even so, those objects can move around, and frequently do move around, so what you attempt is likely to break.

  2. As mentioned, the "address" depends on many things, but this part is fairly easy: __import__() and getattr() can give you these things. They will, however, be extremely fragile, especially when there's more involved than just attribute access. It can only remotely work with things that are in modules.

  3. "Shortening" the name requires examining every possible name, meaning all modules and all local names, and all attributes of them, recrusively. It would be a very slow and time-consuming process, and extremely fragile in the face of anything with a __getattr__ or __getattribute__ method, or with properties that do more than return a value.

  4. is the same thing as 3.

Thomas Wouters
  • 130,178
  • 23
  • 148
  • 122
  • "The same "address" at a later point can give you a different object, or it could be an error." -- Usually the kind of objects that you use addresses for are classes, functions, modules etc., and they rarely change during runtime, so we can ignore these edge cases where it does change. – Ram Rachum Sep 10 '10 at 11:40
  • I'm sorry I neglected to say in my question that these are the kinds of objects I mean. I updated the question, thanks. – Ram Rachum Sep 10 '10 at 11:42
  • ""Shortening" the name requires examining every possible name" -- Okay, so you haven't thought it through. It's possible without examining every possible name, and I have a function that does this. – Ram Rachum Sep 10 '10 at 11:44
  • 2
    It really doesn't change the answer. You may wish to ignore these issues, but they are real issues, and real-world code will run into them. You can make part of what you want with `__import__()` and `getattr()`, as I mention, but there's nothing ready to use and there will be an absurd number of fragile corner cases. – Thomas Wouters Sep 10 '10 at 11:44
  • Thomas: Even the Python standard library ignores these edge cases. For example, if you pickle a class, it gets pickled by its address, and the edge case of the address changing is simply ignored. – Ram Rachum Sep 10 '10 at 11:46
  • 3
    No, a class gets pickled by *its name and the name of the module it was defined in*, not this misguided idea of 'address' you have. This is why nested classes can't be pickled. This is also why pickle is really crappy for any kind of evolving code. – Thomas Wouters Sep 10 '10 at 11:47
  • I am aware that this idea of "address" is fragile and will break for many cases. I am interesting in using it in situations where these cases are unlikely to happen. So this idea of "address" is useful to me. If you think this idea is misguided, then you are not the target audience of this idea, and you should spend your time doing something else rather than arguing with me and telling me that it's a misguided idea. – Ram Rachum Sep 10 '10 at 11:51
4

I released the address_tools module which does exactly what I asked for.

Here is the code. Here are the tests.

It's part of GarlicSim, so you can use it by installing garlicsim and doing from garlicsim.general_misc import address_tools. Its main functions are describe and resolve, which are parallel to repr and eval. The docstrings explain everything about how these functions work.

There is even a Python 3 version on the Python 3 fork of GarlicSim. Install it if you want to use address_tools on Python 3 code.

Ram Rachum
  • 84,019
  • 84
  • 236
  • 374
1

For points 3 and 4, I guess that you are looking for facilities like

from life import life  # life represents life.life
from garlicsim_lib.simpacks import prisoner

However, this is not recommended, as it makes it harder for you or people who read your code to quickly know what prisoner represents (where module does it come from? you have to look at the beginning of the code to get this information).

For point 1, you can do:

from uncertainties import UFloat

print UFloat.__module__  # prints 'uncertainties'

import sys
module_of_UFloat = sys.modules[UFloat.__module__]

For point 2, given the string 'garlicsim_lib.simpacks.prisoner', you can get the object it refers to with:

obj = eval('garlicsim_lib.simpacks.prisoner')

This supposes that you have imported the module with

import garlicsim_lib  # or garlicsim_lib.simpacks

If you even want this to be automatic, you can do something along the lines of

import imp

module_name = address_string.split('.', 1)[0]
mod_info = imp.find_module(module_name)
try:
    imp.load_module(module_name, *mod_info)
finally:
    # Proper closing of the module file:
    if mod_info[0] is not None:
        mod_info[0].close()

This works only in the simplest cases (garlicsim_lib.simpacks need to be available in garlicsim_lib, for instance).

Coding things this way is, however, highly unusual.

Eric O. Lebigot
  • 91,433
  • 48
  • 218
  • 260
  • Point 2 means: Given the string `'garlicsim_lib.simpacks.prisoner.State.step'`, get the actual object `garlicsim_lib.simpacks.prisoner.State.step`. – Ram Rachum Sep 10 '10 at 11:37
  • You've given me quick-n-dirty code. I already have some code that does this, so quick-n-dirty code is not helpful for me. I'm asking if anyone knows of a mature module that does this. – Ram Rachum Sep 10 '10 at 11:53
  • 1
    @cool-RR: Maybe, but downvoting someone who gives you an answer of that length isn't very grateful – Johannes Charra Sep 10 '10 at 13:50
  • @jellybean: Downvoting and upvoting is for saying "an answer is bad" or "answer is good". If we'll avoid downvoting in order to not hurt people's feelings we'll just have a harder time telling the good answers from the bad. Thank you EOL for taking the time to answer my question. – Ram Rachum Sep 10 '10 at 14:10
  • 1
    @cool-RR: I agree. But in this case the answer isn't "bad" but "not helpful" (at most), which maps to "no vote" in my voting decision map. – Johannes Charra Sep 10 '10 at 14:17
  • I think unhelpful answers are harmful because they clutter the question page. – Ram Rachum Sep 10 '10 at 14:32
  • @cool-RR: EOL's answer is only unhelpful to *you* personally, because you "already have some code that does this". The rest of the world doesn't have access to that code, but thanks to EOL's answer, the rest of the world now *does* have access to some admittedly-quick-and-dirty code for doing this stuff, which might be useful to people other than yourself. The answers here aren't just for you, but are also for other people asking the same or similar questions. – RichieHindle Sep 10 '10 at 15:00
  • @RichieHindle: This question was downvoted to -5, most of the people here are telling me some variation of "You're doing it wrong, what you want is wrong," and now you're concerned about other people who might find this question interesting? A much bigger problem is the "You're doing it wrong" attitude that many people showed here, rather than me downvoting an answer that seemingly ignored my ACTUAL QUESTION, which is "Is there a Python module for handling Python object addresses?" – Ram Rachum Sep 10 '10 at 15:16
  • 1
    @cool-RR: Sometimes I'll be driving along and I'll see ahead that there's someone wanting to pull out. If none of the cars in front me lets them out, I'll do it. Sometimes the driver I'm letting out is obviously frustrated at having to wait a long time, and drives speedily away without thanking me. It's him vs. the rest of the world, and I'm part of the rest of the world. EOL didn't downvote your question (I assume). Neither did he tell you you were doing it wrong. I didn't do any of that either. Please don't have a go at EOL or me for something that others did. Nor at EOL for trying to help. – RichieHindle Sep 10 '10 at 15:40
  • 1
    @RichieHindle: Thank you for your support. I assume that cool-RR might have good reasons to be wanting to do certain things, however unusual they are. – Eric O. Lebigot Sep 11 '10 at 08:58
0

Twisted has #2 as twisted/python/reflect.py . You need something like it for making a string-based configuration system, like with Django's urls.py configuration.

Take a look at the code and the version control log to see what they had to do to make it work - and fail! - the right way.

The other things you are looking for place enough restrictions on the Python environment that there is no such thing as a general purpose solution.

Here's something which somewhat implements your #1

>>> import pickle
>>> def identify(f):
...   name = f.__name__
...   module_name = pickle.whichmodule(f, name)
...   return module_name + "." + name
... 
>>> identify(math.cos)
'math.cos'
>>> from xml.sax.saxutils import handler
>>> identify(handler)
'__main__.xml.sax.handler'
>>> 

Your #3 is underdefined. If I do

__builtin__.step = path.to.your.stap

then should the search code find it as "step"?

The simplest implementation I can think of simply searches all modules and looks for top-level elements which are exactly what you want

>>> import sys
>>> def _find_paths(x):
...   for module_name, module in sys.modules.items():
...     if module is None:
...         continue
...     for (member_name, obj) in module.__dict__.items():
...       if obj is x:
...         yield module_name + "." + member_name
... 
>>> def find_shortest_name_to_object(x):
...   return min( (len(name), name) for name in _find_paths(x) )[1]
... 
>>> find_shortest_name_to_object(handler)
'__builtin__._'
>>> 5
5
>>> find_shortest_name_to_object(handler)
'xml.sax.handler'
>>> 

Here you can see that 'handler' was actually in _ from the previous expression return, making it the shortest name.

If you want something else, like recursively searching all members of all modules, then just code it up. But as the "_" example shows, there will be surprises. Plus, this isn't stable, since importing another module might make another object path available and shorter.

That's why people say over and over again that what you want isn't actually useful for anything, and that's why there's no modules for it.

And as for your #4, how in the world will any general package cater to those naming needs?

In any case, you wrote

Please, don't try to show me a quick implementation of these things. It's more complicated than it seems, there are plenty of gotchas, and any quick-n-dirty code will probably fail for many important cases. These kind of tasks call for battle-tested code.

so don't think of my examples as solutions but as examples of why what you're asking for makes little sense. It's such a fragile solution space adn the few who venture there (mostly for curiosity) have such different concerns that a one-off custom solution is the best thing. A module for most of these makes no sense, and if it did make sense the explanation of what the module does would probably be longer than the code.

And hence the answer to your question is "no, there are no such modules."

What makes your question even more confusing is that the C implementation of Python already defines an "object address". The docs for id() say:

CPython implementation detail: This is the address of the object.

What you're looking for is the name, or the path to the object. Not the "Python object address."

Andrew Dalke
  • 14,889
  • 4
  • 39
  • 54