Suppose I have the following model:
import uuid

from django.db import models
from django.utils.translation import gettext_lazy as _


class Word(models.Model):
    id = models.UUIDField(primary_key=True,
                          default=uuid.uuid4,
                          editable=False)
    name = models.CharField(_('Word'),
                            db_index=True,
                            max_length=250)
This table contains around 3 million rows; the content is mostly Cyrillic words.
What I want to accomplish is to search for a predefined list of values. The problem is that the words in the table are accented.
The content of the table looks like this:
мо́рква
буря́к
я́блуко
But I query without accents:
words = ['морква', 'буряк', 'яблуко']
import operator
from functools import reduce
from django.db.models import F, Func, Q, Value

Word.objects.annotate(name_without_accent=Func(
    F('name'), Value('[%s]' % ACCENTS), Value(''),
    function='regexp_replace',
)).filter(reduce(operator.or_, [Q(name_without_accent__icontains=w) for w in words], Q()))
The problem is that this query runs very slowly on such a big table, and the words list can contain even more elements (around 20-30).
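One idea I have been considering is to denormalize: store an unaccented copy of each name and index that, so regexp_replace never runs at query time. A minimal sketch of what I mean (the name_unaccent field and strip_accents helper are names I made up; it assumes the accents are Unicode combining marks, which NFD normalization can split off):

import unicodedata

from django.db import models


def strip_accents(text):
    # NFD splits each accented character into its base letter plus a
    # combining mark, so dropping the combining marks removes accents.
    return ''.join(c for c in unicodedata.normalize('NFD', text)
                   if not unicodedata.combining(c))


class Word(models.Model):
    # ... id and name as above ...
    name_unaccent = models.CharField(max_length=250, db_index=True,
                                     editable=False, default='')

    def save(self, *args, **kwargs):
        self.name_unaccent = strip_accents(self.name)
        super().save(*args, **kwargs)

Existing rows would need a one-off backfill, but after that the filter is just Q(name_unaccent__icontains=w), and an exact-match variant (name_unaccent__in=words) could use the b-tree index directly.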
Is there any way to improve the overall performance of querying such a big table? Should I improve the Django query, tune Postgres, or use some external tool?
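On the Postgres side, I have also been looking at pg_trgm, since a plain b-tree index cannot serve a LIKE '%...%' pattern anyway. Would a migration along these lines be the right direction? (Again just a sketch: the app label, dependency, and words_word table name are guesses for my project, and creating the extension needs sufficient database privileges.)

from django.contrib.postgres.operations import TrigramExtension
from django.db import migrations


class Migration(migrations.Migration):

    dependencies = [
        ('words', '0002_word_name_unaccent'),  # hypothetical previous migration
    ]

    operations = [
        # Enable the pg_trgm extension.
        TrigramExtension(),
        # GIN trigram index over the upper-cased column, matching how
        # icontains compares values on Postgres (UPPER(...) LIKE
        # UPPER('%...%')), so substring searches can use an index scan.
        migrations.RunSQL(
            sql="CREATE INDEX word_name_unaccent_trgm ON words_word "
                "USING gin (upper(name_unaccent) gin_trgm_ops);",
            reverse_sql="DROP INDEX word_name_unaccent_trgm;",
        ),
    ]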