0

I use pg_trgm extension to check similarity of a text column. I want to speed it up by using additional conditions, but without success. The speed is the same. Here is my example:

    create table test (
    id serial,
    descr text,
    yesno text,
    truefalse boolean
    );
    insert into test SELECT generate_series(1,1000000) AS id, 
md5(random()::text) AS descr ; 
    update test set yesno = 'yes' where id < 500000;
    update test set yesno = 'no' where id > 499999;
    update test set truefalse = true where id < 100000;
    update test set truefalse = false where id > 99999;
    CREATE INDEX test_trgm_idx ON test USING gist (descr gist_trgm_ops);

So, when I execute query, there is no difference whether or not I use where clause.

 select descr <->  '65c141ee1fdeb269d2e393cb1d3e1c09' 
 as dist, descr, yesno, truefalse from test 
   where 
   yesno = 'yes'
   and 
   truefalse = true 
order by dist 
limit 10;

Is this right?

Stepan Pavlov
  • 119
  • 2
  • 3
  • 7
  • 1
    Well you don't have any indices on `yesno` or `truefalse`. Hence, I think Postgres would have to scan to apply this `WHERE` condition, which might take some resources. Also, the number of records left behind might still be fairly large, and in both cases Postgres has to sort by `dist` to apply the `LIMIT` clause. – Tim Biegeleisen Aug 16 '17 at 05:48
  • @TimBiegeleisen, I added index for boolean column, it doesn't influence... – Stepan Pavlov Aug 16 '17 at 07:30
  • Well the boolean column only has two possible values, so the cardinality is high and therefore an index might not be helpful. – Tim Biegeleisen Aug 16 '17 at 07:36

1 Answers1

1

After creating test data do an ANALYZE to make sure statistics are updated. Then you can use EXPLAIN to find out.

On my machine it does an index scan on test_trgm_idx to scan the rows in order so it can stop when the limit is reached. With the where is actually slightly more work because it has to scan more rows before the limit is reached thought the time difference is not noticable.

Eelke
  • 20,897
  • 4
  • 50
  • 76