3

I have model with title and description fields.

I want to create a GIN index for all the words in the title and description field

So I do it the following way using SQL:

STEP1: Create a table with all the words in title and description using simple config

CREATE TABLE words AS SELECT word FROM  ts_stat('SELECT to_tsvector(''simple'',COALESCE("articles_article"."title", '''')) || to_tsvector(''simple'',COALESCE("articles_article"."description", '''')) FROM "articles_article"');

STEP2: Create GIN index

CREATE INDEX words_idx ON words USING GIN (word gin_trgm_ops);

STEP3: SEARCH

SELECT word, similarity(word, 'sri') AS sml
  FROM words
  WHERE word % 'sri'
  ORDER BY sml DESC, word;

Result:

word  sml
sri 1
srila 0.5
srimad  0.428571

How to do this in DJANGO and also i have to keep updating the GIN index

raratiru
  • 8,748
  • 4
  • 73
  • 113
Santhosh
  • 9,965
  • 20
  • 103
  • 243

1 Answers1

5

Django docs suggest that you install the relevant btree_gin_extension and append the following to the model's Meta class:

from django.contrib.postgres.indexes import GinIndex

class MyModel(models.Model):
    the_field = models.CharField(max_length=512)

    class Meta:
        indexes = [GinIndex(fields=['the_field'])]

A relevant answer can be found here.

Regarding the updating of the index, heroku suggests:

Finally, indexes will become fragmented and unoptimized after some time, especially if the rows in the table are often updated or deleted. In those cases it may be required to perform a REINDEX leaving you with a balanced and optimized index. However be cautious about reindexing big indexes as write locks are obtained on the parent table. One strategy to achieve the same result on a live site is to build an index concurrently on the same table and columns but with a different name, and then dropping the original index and renaming the new one. This procedure, while much longer, won’t require any long running locks on the live tables.

raratiru
  • 8,748
  • 4
  • 73
  • 113