1

I'm doing a pattern matching search with ILIKE in our system, but it gets too slow with some tables due to the amount of records in the table. So I'm implementing trigram index following instructions in this post https://www.depesz.com/2011/02/19/waiting-for-9-1-faster-likeilike/. I'm not using full text search because I need searchs like '%xxx%' and full text search does not work well with that. The test table has 16000 records and I have created a new column in the table for the search concatenating some other columns.

I've run some test and this are de results:

SELECT * FROM "table" WHERE "searchField" ILIKE '%ose%'

NO INDEX 1639 rows 30.3 sec. avg.

GIN INDEX 1639 rows 26.4 sec. avg.

SELECT * FROM "table" WHERE "searchField" ILIKE '%ose%' OR "searchField" ILIKE '%ria%'

NO INDEX 1639 rows 3:02 min. avg.

GIN INDEX 1639 rows 2.56 min. avg.

As you can see it's not a great inprovement, the post said that query time reduce to miliseconds. The explain analyze shows this:

Bitmap Heap Scan on "table"  (cost=22.31..1827.93 rows=1331 width=511)
                             (actual time=0.354..4.644 rows=1639 loops=1)
  Recheck Cond: (("searchField")::text ~~* '%ose%'::text)
  Heap Blocks: exact=585
  ->  Bitmap Index Scan on "table_trgm_gin"  (cost=0.00..21.98 rows=1331 width=0)
                                             (actual time=0.276..0.276 rows=1639 loops=1)
        Index Cond: (("searchField")::text ~~* '%ose%'::text)

The index scan is fast but the condition recheck is too slow. I have read that rechecking is unavoidable due to false positives posibilities. But then I don´t know how to get better results.

Can anyone explain why the index does not make much of a difference and how to get better query times?

Laurenz Albe
  • 209,280
  • 17
  • 206
  • 263
thedummy
  • 11
  • 3

1 Answers1

0

The EXPLAIN (ANALYZE) you show must be from a different table, because there the duration is under 5 milliseconds.

I notice that the patterns you search for are very short (3 characters).

Trigram indexes don't perform good for short patterns, because many rows will match during the index scan and all of these rows have to be rechecked.

Two things to check to see if my analysis is correct:

  • Test with longer patterns and see if the performance improves.
  • Look at the EXPLAIN (ANALYZE) output of the query that takes three minutes and see if a lot of rows are found during the index scan.

If I am right, there is not much you can do. Looking for short patterns just doesn't perform very well. You could try to limit the minimum length of characters in a pattern to avoid the problem.

Laurenz Albe
  • 209,280
  • 17
  • 206
  • 263
  • The `EXPALIN ANALYZE` is for the same table, with the first example I put in my cuestion. Use longer patterns is kind of useful but the real important thing is the number of matchs '%educativa%' takes 45 secs for 3608 row but '%josefa%' takes 300 milisecs for 28 row. But the same thing happens without the index, in fact in my tests longer patterns without index are faster. Thank you anyway for your answer – thedummy Nov 28 '17 at 14:52
  • Hmm, how can that be? You say that the first query took 26 seconds, yet the `EXPLAIN` says: actual time = 4.644 milliseconds. Could you retry with `EXPLAIN (ANALYZE, BUFFERS)`? Maybe that can clarify the problem. – Laurenz Albe Nov 28 '17 at 16:51