
First of all, since this is my first question/post, I would like to thank you all for this great community and amazing service. As for many developers around the world, Stack Overflow is my main resource when it comes to code issues.

Notice :

This post is a bit long (sorry). It covers two different but related aspects of the situation, and is organised as follows:

  1. Background / Context
  2. A design issue, on which I would like to receive your advice.
  3. The solution I'm trying to implement to solve 2.
  4. The actual issue, and question, related to 3.

1. Some context :

I have two different Model objects (Report and Jobs) from an existing, poorly designed implementation. Both objects are quite similar in purpose, but were probably implemented at different times.

A lot of processing happens on these objects, and since the system has to evolve, I started to write a common base class (interface) from which both will inherit. Currently both models use different field names for the same purpose, such as author and juser to denote the User (which is very stupid), and so on.

I cannot afford to just change the column names in the database and then go through thousands of lines of code to change every reference to these fields (even though I could, thanks to the Find Usages feature of modern IDEs), and these objects might also be used somewhere else. So I used the db_column= option to give each model the same field names and ultimately handle both objects alike (instead of having thousands of lines of duplicated code to do the same stuff).

So, I have something like this:

from django.db import models
class Report(Runnable):
    _name = models.CharField(max_length=55, db_column='name')
    _description = models.CharField(max_length=350, blank=True, db_column='description')
    _author = models.ForeignKey(User, db_column='author_id')
    # and so on

class Jobs(Runnable):
    _name = models.CharField(max_length=55, db_column='jname')
    _description = models.CharField(max_length=4900, blank=True, db_column='jdetails')
    _author = models.ForeignKey(User, db_column='juser_id')
    # and so on

As I said earlier, to avoid rewriting the objects' client code, I used properties that shadow the fields:

from django.db import models
class Runnable(models.Model):   
    objects = managers.WorkersManager() # The default manager.

    @property # Report
    def type(self):
        return self._type

    @type.setter # Report
    def type(self, value):
        self._type = value

    # for backward compatibility, TODO remove both
    @property # Jobs
    def script(self):
        return self._type

    @script.setter # Jobs
    def script(self, value):
        self._type = value
    # and so on


2. The design issue :

This is nice and it's kind of what I wanted, except that Report.objects.filter(name='something') or Jobs.objects.filter(jname='something') no longer work, because Django resolves lookup keywords against the model's field names (now _name), not against properties. The same goes for .get(), .exclude(), etc., and the client code is sadly full of those.
I'm of course planning to replace them all with methods of my newly created WorkersManager.

Aside:
Wait... what? "newly created WorkersManager"??
Yes, after two years and thousands of lines of code, there was no Manager in here, crazy right?
But guess what? That's the least of my concerns. And cheer up: most of the code still lies in view.py and associated files (instead of being properly inside the objects it is supposed to manipulate), so it is basically somewhat "pure" imperative Python...

Great right ?


3. My solution :

After a lot of reading (here and there) and research, I found out that:

  1. Trying to subclass Field was not a solution.
  2. I could actually overload QuerySet.

So I did :

from django.db.models.query import QuerySet as __original_QS
class WorkersManager(models.Manager):   
    def get_queryset(self):
        class QuerySet(__original_QS):
            """
            Overloads original QuerySet class
            """
            __translate = _translate # an external function that changes the names of the keys in kwargs

            def filter(self, *args, **kwargs):
                args, kwargs = self.__translate(*args, **kwargs)
                super(QuerySet, self).filter(args, kwargs)

            # and many more others [...]
        return QuerySet(self.model, using=self._db)

And this is quite fine.

4. So what's wrong ?

The problem is that Django internally uses Q inside django.db.models.query, through its own imports, and Q is not exposed or referenced anywhere that would let me overload it.

>>> a =Report.objects.filter(name='something')
Traceback (most recent call last):
  File "<input>", line 1, in <module>
  File "/venv/local/lib/python2.7/site-packages/django/db/models/manager.py", line 143, in filter
    return self.get_query_set().filter(*args, **kwargs)
  File "/venv/local/lib/python2.7/site-packages/django/db/models/query.py", line 624, in filter
    return self._filter_or_exclude(False, *args, **kwargs)
  File "/venv/local/lib/python2.7/site-packages/django/db/models/query.py", line 642, in _filter_or_exclude
    clone.query.add_q(Q(*args, **kwargs))
  File "/venv/local/lib/python2.7/site-packages/django/db/models/sql/query.py", line 1250, in add_q
    can_reuse=used_aliases, force_having=force_having)
  File "/venv/local/lib/python2.7/site-packages/django/db/models/sql/query.py", line 1122, in add_filter
    process_extras=process_extras)
  File "/venv/local/lib/python2.7/site-packages/django/db/models/sql/query.py", line 1316, in setup_joins
    "Choices are: %s" % (name, ", ".join(names)))
FieldError: Cannot resolve keyword 'name' into field. Choices are: _author, _description, _name, # and many more

But I do remember reading something about how Django only loads the first occurrence of a Model, and how you could trick it by redefining such a Model before the import (though obviously this doesn't apply to Python in general).
So ultimately I tried to overload Q by redefining it before or after importing the relevant class, but I cannot figure it out.

Here is what I tried :

from django.db.models.query_utils import Q as __originalQ

__translation = {'name': '_name',} # has many more entries, this is just an example

def _translate(*args, **kwargs):
    for key in kwargs:
        if key in __translation.keys():
            kwargs[__translation[key]] = kwargs[key]
            del kwargs[key]
    return args, kwargs

class Q(__originalQ):
    """
    Overloads original Q class
    """
    def __init__(self, *args, **kwargs):
        super(Q, self).__init__(_translate(*args, **kwargs))

# now import QuerySet which should use the new Q class
from django.db.models.query import QuerySet as __original_QS

class QuerySet(__original_QS):
    """
    Overloads original QuerySet class
    """
    __translate = _translate # writing shortcut

    def filter(self, *args, **kwargs):
        args, kwargs = self.__translate(*args, **kwargs)
        super(QuerySet, self).filter(args, kwargs)
    # and some others

# now import models, which should use the new QuerySet class
from django.db import models

class WorkersManager(models.Manager):
    def get_queryset(self):
        # might not even be required if above code was successful
        return QuerySet(self.model, using=self._db)

This of course has no effect, as _filter_or_exclude still uses the Q that django.db.models.query imported for itself.
So of course, an intuitive solution would be to overload _filter_or_exclude and copy its original code without calling super.
But here is the catch: I'm using an old version of Django that might be updated someday, and I don't want to mess with Django implementation specifics. I already did so with get_queryset, but I guess that is kind of OK since it is (as far as I understand) a placeholder meant for overloading, and it was also the only way.
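
The reason redefining Q in one's own module has no effect can be illustrated without Django: a function looks up global names in the module where it was defined, not in the module that imports it. Below is a minimal sketch using a throwaway module built with types.ModuleType. The names library and make_q are hypothetical stand-ins, not Django's code.

```python
import types

# Hypothetical stand-in for a library module such as django.db.models.query.
library = types.ModuleType("library")
exec(
    "class Q(object):\n"
    "    label = 'original'\n"
    "\n"
    "def make_q():\n"
    "    # Q is looked up in this module's globals at call time\n"
    "    return Q()\n",
    library.__dict__,
)


# Client code: a subclass with the same name only rebinds the name
# in the client's own namespace.
class Q(library.Q):
    label = "overloaded"


# The library function still builds the library's own Q.
before = library.make_q().label

# Rebinding the name inside the library module (monkey patching) changes
# what make_q() builds, but it affects every caller of that module.
library.Q = Q
after = library.make_q().label
```

Here before is 'original' and after is 'overloaded'. Patching a name inside a third-party module this way is fragile across upgrades, which is the concern raised above.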

So here I am, and my question is :
Is there no other way to do it? Is there no way for me to overload Q inside of a Django module?

Thank you very much for reading all the way :)

Here is a potato (Oops, wrong website, sorry :) )

EDIT :

So, after trying to overload _filter_or_exclude, it seems that it has no effect.
I'm probably missing something about the call stack order or something similar... I'll continue tomorrow and let you know.

Clem
  • I think every new hack will make the problem even more complicated. Honestly, don't think there is a silver bullet for that, you'll end up overloading every part of Django db for (probably) more errors to come. Sadly, imho, cleaning up/refactoring the code could be worthier. – Lorenzo Peña Jul 08 '15 at 18:39

1 Answer


Yay! I found the solution.

It turns out that, first, I forgot to return a value in my functions. I had:

def filter(self, *args, **kwargs):
    args, kwargs = self.__translate(*args, **kwargs)
    super(QuerySet, self).filter(args, kwargs)

Instead of :

def filter(self, *args, **kwargs):
    args, kwargs = self.__translate(*args, **kwargs)
    return super(QuerySet, self).filter(args, kwargs)

and I also had :

args, kwargs = self.__translate(*args, **kwargs)

Instead of :

args, kwargs = self.__translate(args, kwargs)

which unpacks the arguments at the function call, so everything from the original kwargs ended up in args, preventing translate from having any effect.
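
The unpacking issue comes down to the call having to match the signature. Note that the filter overrides in the question also call super().filter(args, kwargs) without unpacking. Here is a small sketch of the two consistent combinations and one mismatch. The helpers starred and plain are hypothetical, not part of the original code.

```python
def starred(*args, **kwargs):
    """Collects positional and keyword arguments itself."""
    return args, kwargs


def plain(args, kwargs):
    """Expects an already-built tuple and dict."""
    return args, kwargs


positional = ("a",)
keywords = {"name": "something"}

# Consistent: unpack when calling a starred signature.
matched_starred = starred(*positional, **keywords)

# Consistent: pass the containers as-is to a plain signature.
matched_plain = plain(positional, keywords)

# Mismatch: the tuple and the dict both arrive as positional arguments,
# so the callee's kwargs is empty and there are no keys to translate.
mismatched = starred(positional, keywords)
```

In the mismatched call, the callee's args holds both the tuple and the dict, and its kwargs is empty.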

But even worse, I failed to understand that I could overload filter, get and so on directly in my custom manager, which saves me the effort of dealing with QuerySet and Q.

In the end the following code is working as expected :

def _translate(args, kwargs):
    # Iterate over a copy of the keys, since kwargs is modified inside the loop.
    for key in list(kwargs.keys()):
        if key in __translation:
            kwargs[__translation[key]] = kwargs.pop(key)
    return args, kwargs

class WorkersManager(models.Manager):
    def filter(self, *args, **kwargs):
        args, kwargs = _translate(args, kwargs)
        return super(WorkersManager, self).filter(*args, **kwargs)

    # etc...
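
One limitation of translating exact keys is that Django lookups can carry a suffix after a double underscore (for example name__icontains), which an exact-match mapping would miss. A possible variant, as a sketch only: TRANSLATION and translate_kwargs are hypothetical names, and the mapping entries are assumed for illustration.

```python
# Hypothetical mapping, mirroring the __translation dict from the question.
TRANSLATION = {"name": "_name", "author": "_author"}


def translate_kwargs(kwargs, mapping=TRANSLATION):
    """Return a new dict with the leading field name of each key translated."""
    translated = {}
    for key, value in kwargs.items():
        # Split "name__icontains" into "name", "__" and "icontains".
        head, sep, rest = key.partition("__")
        translated[mapping.get(head, head) + sep + rest] = value
    return translated
```

Inside the manager this could be called as super(WorkersManager, self).filter(*args, **translate_kwargs(kwargs)). It builds a new dict instead of modifying kwargs in place, and it does not cover Q objects passed positionally.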
Clem