14

I'm using Django Paginator everywhere on my website and even wrote a special template tag, to make it more convenient. But now I got to a state, where I need to make a complex custom raw SQL query, that without a LIMIT will return about 100K records.

How can I use Django Pagintor with custom query?

Simplified example of my problem:

My model:

class PersonManager(models.Manager):

    def complicated_list(self):

        from django.db import connection

        #Real query is much more complex        
        cursor.execute("""SELECT * FROM `myapp_person`""");  

        result_list = []

        for row in cursor.fetchall():
            result_list.append(row[0]); 

        return result_list


class Person(models.Model):
    name      = models.CharField(max_length=255);
    surname   = models.CharField(max_length=255);     
    age       = models.IntegerField(); 

    objects   = PersonManager();

The way I use pagintation with Django ORM:

all_objects = Person.objects.all();

paginator = Paginator(all_objects, 10);

try:
    page = int(request.GET.get('page', '1'))
except ValueError:
    page = 1

try:
    persons = paginator.page(page)
except (EmptyPage, InvalidPage):
    persons = paginator.page(paginator.num_pages)

This way, Django get very smart, and adds LIMIT to a query when executing it. But when I use custom manager:

all_objects = Person.objects.complicated_list();

all data is selected, and only then python list is sliced, which is VERY slow. How can I make my custom manager behave similar like built in one?

Chillar Anand
  • 27,936
  • 9
  • 119
  • 136
Silver Light
  • 44,202
  • 36
  • 123
  • 164

4 Answers4

15

Looking at Paginator's source code, page() function in particular, I think that it's only matter of implementing slicing on your side, and translating that to relevant LIMIT clause in SQL query. You might also need to add some caching, but this starts to look like QuerySet, so maybe you can do something else:

  • you can create database VIEW using CREATE VIEW myview AS [your query];
  • add Django model for that VIEW, with Meta: managed=False
  • use that model like any other model, including slicing its querysets - this means it's perfectly suitable for using with Paginator

(For your information - I've been using this approach for a long time now, even with complex many-to-many relationships with VIEWs faking m2m intermediate tables.)

Tomasz Zieliński
  • 16,136
  • 7
  • 59
  • 83
  • Wow, thats cool :) Thank's for the reply, but I guiess I'm looking in a wrong directions. I've asked another question: http://stackoverflow.com/questions/2532686/django-getting-the-list-of-related-records-for-a-list-of-objects Guiess custom managers are not the best thing here. – Silver Light Mar 28 '10 at 11:21
6

Here is a RawPaginator class I made that overrides Paginator to work with raw queries. It takes one additional argument, count, which is the total count of your query. It doesn't slice the object_list because you must paginate in your raw query via OFFSET and LIMIT.

from django.core.paginator import Paginator

class RawPaginator(Paginator):
    def __init__(self, object_list, per_page, count, **kwargs):
        super().__init__(object_list, per_page, **kwargs)
        self.raw_count = count

    def _get_count(self):
        return self.raw_count
    count = property(_get_count)

    def page(self, number):
        number = self.validate_number(number)
        return self._get_page(self.object_list, number, self)
Jonathan Potter
  • 2,981
  • 1
  • 24
  • 19
2

I don't know about Django 1.1 but if you can wait for 1.2 (which shouldn't be that long anymore) you can make use of objects.raw() as described in this article and in the development documentation.

Otherwise, if you query is not too complex, maybe using the extra clause is sufficient.

Felix Kling
  • 795,719
  • 175
  • 1,089
  • 1,143
  • Thank you for a usefull tip. But I guess it does't help in my situation – Silver Light Mar 28 '10 at 11:24
  • 1
    You still can't get count for a raw query results. It looks like you actually have to do list(objects.raw()) for it to work with paginator. Thanks to http://stackoverflow.com/questions/2317452/django-count-rawqueryset for that info. – Josh Jun 19 '12 at 00:03
2

I also wanted to plug a PaginatedRawQuerySet which I wrote (please consider this as an alpha version). This adds the slicing ability to a raw queryset. Please refer to this answer – which I wrote for another question with a similar requirement – in order to understand how it works (especially the "A word of caution" section in the end).

from django.db import models
from django.db.models import sql
from django.db.models.query import RawQuerySet


class PaginatedRawQuerySet(RawQuerySet):
    def __init__(self, raw_query, **kwargs):
        super(PaginatedRawQuerySet, self).__init__(raw_query, **kwargs)
        self.original_raw_query = raw_query
        self._result_cache = None

    def __getitem__(self, k):
        """
        Retrieves an item or slice from the set of results.
        """
        if not isinstance(k, (slice, int,)):
            raise TypeError
        assert ((not isinstance(k, slice) and (k >= 0)) or
                (isinstance(k, slice) and (k.start is None or k.start >= 0) and
                 (k.stop is None or k.stop >= 0))), \
            "Negative indexing is not supported."

        if self._result_cache is not None:
            return self._result_cache[k]

        if isinstance(k, slice):
            qs = self._clone()
            if k.start is not None:
                start = int(k.start)
            else:
                start = None
            if k.stop is not None:
                stop = int(k.stop)
            else:
                stop = None
            qs.set_limits(start, stop)
            return qs

        qs = self._clone()
        qs.set_limits(k, k + 1)
        return list(qs)[0]

    def __iter__(self):
        self._fetch_all()
        return iter(self._result_cache)

    def count(self):
        if self._result_cache is not None:
            return len(self._result_cache)

        return self.model.objects.count()

    def set_limits(self, start, stop):
        limit_offset = ''

        new_params = tuple()
        if start is None:
            start = 0
        elif start > 0:
            new_params += (start,)
            limit_offset = ' OFFSET %s'
        if stop is not None:
            new_params = (stop - start,) + new_params
            limit_offset = 'LIMIT %s' + limit_offset

        self.params = self.params + new_params
        self.raw_query = self.original_raw_query + limit_offset
        self.query = sql.RawQuery(sql=self.raw_query, using=self.db, params=self.params)

    def _fetch_all(self):
        if self._result_cache is None:
            self._result_cache = list(super().__iter__())

    def __repr__(self):
        return '<%s: %s>' % (self.__class__.__name__, self.model.__name__)

    def __len__(self):
        self._fetch_all()
        return len(self._result_cache)

    def _clone(self):
        clone = self.__class__(raw_query=self.raw_query, model=self.model, using=self._db, hints=self._hints,
                               query=self.query, params=self.params, translations=self.translations)
        return clone
Community
  • 1
  • 1
BitParser
  • 3,748
  • 26
  • 42