5

In Wagtail's documentation on Elasticsearch indexing, it seems that all instances of a given model are added to the index. But I'd like to exclude some (many) rows from being indexed, either by supplying a QuerySet or by setting an exclude parameter of some kind (a QuerySet would be better).

Is there any way to do this? Or do I need to index Wagtail models from outside of Wagtail?

shacker

4 Answers

7

You can define a get_indexed_objects method on the model class, returning a queryset of items to be indexed:

@classmethod
def get_indexed_objects(cls):
    return cls.objects.filter(live=True)
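
To make the effect of that hook concrete, here is a minimal pure-Python sketch. It is a toy stand-in, not Wagtail's code: `Record`, `RecordModel`, `rows` and `rebuild_index` are hypothetical names, and a plain list plays the role of `cls.objects`. It only illustrates the idea that a rebuild indexes whatever the hook returns.

```python
from dataclasses import dataclass


@dataclass
class Record:
    """Hypothetical stand-in for a model row; not a real Django model."""
    title: str
    live: bool


class RecordModel:
    # Stand-in for the table contents that cls.objects would query.
    rows = [
        Record("Public page", live=True),
        Record("Private draft", live=False),
    ]

    @classmethod
    def get_indexed_objects(cls):
        # Mirrors cls.objects.filter(live=True) from the snippet above.
        return [row for row in cls.rows if row.live]


def rebuild_index(model):
    """Toy rebuild: index only what the model's hook returns."""
    return [row.title for row in model.get_indexed_objects()]


index = rebuild_index(RecordModel)
print(index)
```

In a real project the filter can be any queryset expression, so rows flagged as sensitive never reach the search backend during a rebuild.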
gasman
2

If you want to exclude an entire Wagtail Page model from being indexed at all, this seems to work (as an instance method):

def get_indexed_instance(self):
    return None
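
A rough sketch of why this works, assuming the indexer skips any object whose hook returns None, as described above. Everything here is a toy stand-in rather than Wagtail's implementation: `Note`, `private` and `index_on_save` are hypothetical names, and the conditional return is a variation that skips only some instances instead of the whole model.

```python
class Note:
    """Hypothetical stand-in for an indexed model; not a real Wagtail class."""

    def __init__(self, title, private=False):
        self.title = title
        self.private = private

    def get_indexed_instance(self):
        # Returning None asks the indexer to skip this object.
        return None if self.private else self


def index_on_save(instance, index):
    """Toy save handler: add the instance only if the hook returns one."""
    indexed = instance.get_indexed_instance()
    if indexed is not None:
        index.append(indexed.title)


index = []
index_on_save(Note("Meeting agenda"), index)
index_on_save(Note("HR complaint", private=True), index)
print(index)
```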

Nick
1

If you just need to perform some simple filtering on the results, I'd recommend indexing everything and doing the filtering at query time (unless you're excluding quite a lot of documents to save disk space):

MyModel.objects.filter(live=True).search("..")

Wagtail will convert that filter into part of the Elasticsearch query, so this shouldn't have any noticeable effect on performance. This does require all the filter fields to be indexed using index.FilterField though (Wagtail has done this for all the basic page fields if you are using the page model).

The main advantage of this approach is that it lets you easily drop the filter if you ever need to for a separate search feature in the future. For example, Wagtail does this to allow searching all pages in the admin, but only live ones on the frontend.
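
The trade-off can be sketched in plain Python. This is a toy model of query-time filtering, not Wagtail or Elasticsearch code: `Doc`, `INDEX` and `search` are hypothetical names, and substring matching stands in for a real search query. One index serves both a filtered frontend search and an unfiltered admin search.

```python
from dataclasses import dataclass


@dataclass
class Doc:
    title: str
    live: bool


# Everything is indexed, including non-live documents.
INDEX = [
    Doc("Launch notes", live=True),
    Doc("Launch notes (draft)", live=False),
]


def search(query, live_only=False):
    """Toy search: substring match, with an optional query-time filter."""
    hits = [d for d in INDEX if query.lower() in d.title.lower()]
    if live_only:
        hits = [d for d in hits if d.live]
    return [d.title for d in hits]


frontend_hits = search("launch", live_only=True)
admin_hits = search("launch")
print(frontend_hits, admin_hits)
```

Note that the draft is still stored in the index in this sketch, so query-time filtering hides records from results but does not keep them out of the search backend.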

  • Thanks Karl. In our case, there may be some privacy issues we need to be careful about surrounding certain records, so keeping them out of the index is important. But as a general approach, you're right - indexing everything and then filtering gives you more flexibility in general. – shacker Feb 24 '17 at 22:38
1

You can exclude an entire Wagtail Page model from being indexed by adding this to your model:

@classmethod
def get_indexed_objects(cls):
    """
    Hide model from search results.
    """
    return cls.objects.none()

(This is a tweak to Nick's answer.)

Mark Chackerian