I want to implement something similar to the Django fixture system, where each fixture sets a model property that names the fixture's model class. It looks something like this:

my_app.models.my_model

My question is: what is the standard way to process such a string in order to create an instance of the class that this "path" points to? I think it should go something like this:

  1. Split it to module name and class name parts
  2. Load module (if not loaded)
  3. Acquire class from module by its name
  4. Instantiate it

How exactly should I do it?

Edit: I came up with a dirty solution:

def _resolve_class(self, class_path):
    tokens = class_path.split('.')
    class_name = tokens[-1]
    module_name = '.'.join(tokens[:-1])
    exec "from %s import %s" % (module_name, class_name)
    class_obj = locals()[class_name]
    return class_obj

That does its job, but it is dirty: it relies on exec, so a maliciously prepared fixture could manipulate execution. How should it be done properly?

yakxxx

4 Answers

I often use Django's import_module function, since it's used throughout the framework to do exactly this. Note that django.utils.importlib was removed in Django 1.9; the standard library's importlib provides the same function, so import it from there.

from importlib import import_module

path = 'foo.bar.classname'
module_path, class_name = path.rsplit('.', 1)
module = import_module(module_path)
instance = getattr(module, class_name)()  # instantiated!

As for the exec in your solution, you can easily remove it by using __import__ on the module, then getting the class.

Yuji 'Tomita' Tomita

EDIT: Why not do essentially what the loaddata command does (https://github.com/django/django/blob/master/django/core/management/commands/loaddata.py#L184), using Django's deserialize function (https://docs.djangoproject.com/en/dev/topics/serialization/#deserializing-data)?

scytale
  • Won't `json.loads` just give me the dict? I want to have full instance of class pointed by `model` field and initiated with `fields` data. – yakxxx Oct 22 '12 at 16:26
  • Yes, this answer is a bit more relevant, but my real concern is how to do it outside of Django. I've just read through the Django serializers code and it seems that it somehow keeps track of its registered models. I want a generic method to achieve the same goal without using Django internals (though I probably emphasized that too weakly in the question) – yakxxx Oct 22 '12 at 17:12
  • Ah. Well, I guess a wholesale copy and paste from `django.core.serializers` is the best approach then. Are you using the Django ORM or not? Your question seems to indicate that you are. – scytale Oct 22 '12 at 17:18
  • No, I'm not using the Django ORM. And `django.core.serializers` seems to approach it from the other side: it somehow registers models when they are imported and then accesses them by dict lookup when deserialization happens. As I wrote, I want to make it work without Django infrastructure, so I cannot just copy-paste it – yakxxx Oct 22 '12 at 17:34
  • ah that clarifies things. I'll post a new answer. – scytale Oct 22 '12 at 17:45
Note that the danger of using exec in a function is that it often allows an attacker to supply bogus values which will cause your function to "accidentally" execute whatever code the attacker wants. Here you're directly writing a function that allows precisely that! Using exec doesn't make it much worse. The only difference is that without exec they have to figure out how to get their code into a file on python's import path.

That doesn't mean you shouldn't do it. Just be aware of what you're doing. Plugin frameworks inherently have this problem; the whole point of making a framework extensible at runtime is that you want whoever can configure the plugins to be able to execute whatever code they like inside your program. If your program will be used in an environment where the end users are not the same people who are configuring the plugins, make sure you treat _resolve_class the same way you treat exec; don't allow users to enter strings which you directly pass to _resolve_class!
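
One way to act on that advice is to check the module part of the path against an allowlist before importing anything. The following is only a sketch under assumptions: the name `resolve_trusted_class` and the prefixes in `ALLOWED_PREFIXES` are hypothetical and not part of the question, and it uses the standard library's `importlib.import_module` for the import itself.

```python
from importlib import import_module

# Hypothetical allowlist: only modules at or below these prefixes may be loaded.
ALLOWED_PREFIXES = ("collections", "my_app.models")


def resolve_trusted_class(class_path, allowed_prefixes=ALLOWED_PREFIXES):
    """Resolve a dotted path to a class, but only from allowlisted modules."""
    module_path, _, class_name = class_path.rpartition(".")
    permitted = any(
        module_path == prefix or module_path.startswith(prefix + ".")
        for prefix in allowed_prefixes
    )
    if not permitted:
        # Refuse before importing: importing a module already runs its code.
        raise ValueError("module %r is not allowlisted" % module_path)
    return getattr(import_module(module_path), class_name)
```

With this in place, `resolve_trusted_class("collections.OrderedDict")` returns the class, while a path such as `"os.system"` is rejected without `os` ever being looked up. The check only narrows which modules can be reached; anything importable under an allowed prefix is still trusted.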

Now, that aside, you can avoid the use of exec quite easily. Python has a built-in function __import__ for getting at the underlying implementation of the import mechanism. You can use it to do dynamic imports (help(__import__) was enough for me to figure out how it works to write this answer; there are also the docs if you need a bit more detail). Using that, your function could look something like:

def _resolve_class(self, class_path):
    modulepath, classname = class_path.rsplit('.', 1)
    module = __import__(modulepath, fromlist=[classname])
    return getattr(module, classname)

(Note that I've also used rsplit with a maximum number of splits to avoid having to split the module path only to rejoin it again)
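
For comparison, here is a sketch of the same idea using `importlib.import_module` from the standard library instead of `__import__`, with malformed paths and failed lookups reported as a single exception type. The function name `resolve_class` and the choice of `ValueError` are assumptions made for illustration, not something taken from the question.

```python
from importlib import import_module


def resolve_class(class_path):
    """Return the object named by a dotted path such as 'package.module.ClassName'."""
    module_path, sep, class_name = class_path.rpartition(".")
    if not sep or not module_path or not class_name:
        raise ValueError("expected 'module.ClassName', got %r" % class_path)
    try:
        # import_module returns the module itself, so no fromlist is needed.
        module = import_module(module_path)
    except ImportError as exc:
        raise ValueError("cannot import module %r: %s" % (module_path, exc))
    try:
        return getattr(module, class_name)
    except AttributeError:
        raise ValueError("module %r has no attribute %r" % (module_path, class_name))
```

Calling `resolve_class("collections.OrderedDict")` returns the class object, which the caller can then instantiate. Like the `__import__` version, this imports whatever module the string names, so the earlier warning about untrusted input still applies.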

Ben

Just copy the format used by the Django JSON serializer and write your own object creation code. The format is a list of dicts. Each dict contains a number of metadata fields, including model, which holds the app name and model name, plus a nested dict with the model's field data:

[
    {
        "pk": 1,
        "model": "app_name.model_name",
        "fields": {
            "default_value": "",
            "category": null,
            "position": 10,
            ...
        }
    },
    {
        "pk": 2,
        "model": "app_name.model_name",
        "fields": {
            ...
        }
    }
]

So use json.loads() to parse the data, then iterate over the dicts and pull out what you need:

import json

for obj_dict in json.loads(data):
    # "model" holds a string such as "app_name.model_name"
    app_name, model_name = obj_dict["model"].split('.')
    # some magic here to import your model
    my_model = magic_import_function(app_name, model_name)
    # "fields" is a dict, so unpack it as keyword arguments
    foo = my_model(**obj_dict["fields"])
    foo.save()
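
The `magic_import_function` placeholder could be filled in along the following lines. This is a rough sketch rather than Django's own mechanism: it assumes the "model" value is a full dotted path to the class (as in the question's my_app.models.my_model) instead of Django's app_name.model_name form, the names `import_by_path` and `load_fixtures` are hypothetical, and the example fixture points at a standard-library class so that it runs without any ORM (which is also why there is no save() call).

```python
import json
from importlib import import_module


def import_by_path(dotted_path):
    """Import 'package.module.ClassName' and return the class object."""
    module_path, class_name = dotted_path.rsplit(".", 1)
    return getattr(import_module(module_path), class_name)


def load_fixtures(data):
    """Build one instance per fixture entry, passing 'fields' as keyword arguments."""
    instances = []
    for entry in json.loads(data):
        cls = import_by_path(entry["model"])
        instances.append(cls(**entry["fields"]))
    return instances


# Illustrative fixture that points at a standard-library class instead of a model.
FIXTURE = """
[
    {"pk": 1, "model": "datetime.date",
     "fields": {"year": 2012, "month": 10, "day": 22}}
]
"""

objects = load_fixtures(FIXTURE)
```

Here `objects` ends up holding a single `datetime.date` built from the "fields" dict. The "pk" entry is ignored in this sketch; a real loader would decide how to apply it to the created object.
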
scytale
  • Yes, and now here's where my question goes. Define `magic_import_function`. It is the point of my question from the very beginning. @scytale: See edit please. – yakxxx Oct 22 '12 at 18:13