
Hi all! I have a table created in PostgreSQL 9.3 that contains about 24 million records, with the following structure:

CREATE TABLE jtest
(
  recordid uuid NOT NULL,
  details json,
  tsv tsvector,
  CONSTRAINT jtest_pkey PRIMARY KEY (recordid)
)
WITH (
  OIDS=FALSE
);

The tsv column is filled by a trigger based on the JSON contents: to_tsvector('english', json_extract_path_text(details, 'info')).
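
For reference, a minimal sketch of such a trigger (the function and trigger names here are illustrative, not the originals from my database):

CREATE OR REPLACE FUNCTION jtest_tsv_update() RETURNS trigger AS $$
BEGIN
  -- same expression as above; coalesce guards against a missing 'info' key
  NEW.tsv := to_tsvector('english',
      coalesce(json_extract_path_text(NEW.details, 'info'), ''));
  RETURN NEW;
END;
$$ LANGUAGE plpgsql;

CREATE TRIGGER jtest_tsv_trigger
  BEFORE INSERT OR UPDATE ON jtest
  FOR EACH ROW EXECUTE PROCEDURE jtest_tsv_update();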

The info field in the JSON structure has similar content across rows, for example "Milk from manufacturer one", "Milk from another two". There are about 1 million records that contain "milk".

If I use the following query:

select recordid from jtest where tsv @@ to_tsquery('english', 'milk');

and don't use any index on the tsv column, the query takes about 250 seconds. If I create a GIN index on tsv, the same query takes about 200 seconds.
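
For reference, the GIN index was created with a statement along these lines (the index name is illustrative):

CREATE INDEX jtest_tsv_idx ON jtest USING gin (tsv);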

Is there any way to improve performance here?

user3111365

1 Answer


The answer lies in the question:

There are about 1 million records that contain "milk".

Index or not, you still need to retrieve those million rows. Also note that if a million rows means most of the table, Postgres will ignore the index entirely and seq scan the whole table.
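
To confirm which plan the planner actually picks, run the query under EXPLAIN, for example:

explain (analyze, buffers)
select recordid from jtest where tsv @@ to_tsquery('english', 'milk');

If the output shows a Seq Scan rather than a Bitmap Index Scan on the GIN index, the planner has decided the index would not pay off for a term this common.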

The index will help if you change the query to:

select recordid from jtest where tsv @@ to_tsquery('english', 'rare string');
Denis de Bernardy