
Hi all! I have a table created in PostgreSQL 9.3 that contains about 24 million records, with the following structure:

CREATE TABLE jtest
(
  recordid uuid NOT NULL,
  details json,
  tsv tsvector,
  CONSTRAINT jtest_pkey PRIMARY KEY (recordid)
)
WITH (
  OIDS=FALSE
);

The tsv column is filled by a trigger based on the JSON contents: to_tsvector('english', json_extract_path_text(details, 'info')).
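
For reference, a minimal sketch of such a trigger (the function and trigger names here are illustrative, not the originals from my database):

CREATE OR REPLACE FUNCTION jtest_tsv_update() RETURNS trigger AS $$
BEGIN
  -- same expression as above; coalesce guards against a missing 'info' key
  NEW.tsv := to_tsvector('english',
      coalesce(json_extract_path_text(NEW.details, 'info'), ''));
  RETURN NEW;
END;
$$ LANGUAGE plpgsql;

CREATE TRIGGER jtest_tsv_trigger
  BEFORE INSERT OR UPDATE ON jtest
  FOR EACH ROW EXECUTE PROCEDURE jtest_tsv_update();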

The info field in the JSON structure has similar content across rows, for example "Milk from manufacturer one", "Milk from another two". There are about 1 million records that contain "milk".

If I use the following query:

select recordid from jtest where tsv @@ to_tsquery('english', 'milk');

and don't use any index on the tsv column, the query takes about 250 seconds. If I create a GIN index on tsv, the same query takes about 200 seconds.
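
For reference, the GIN index was created with a statement along these lines (the index name is illustrative):

CREATE INDEX jtest_tsv_idx ON jtest USING gin (tsv);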

Is there any way to improve performance here?

user3111365

1 Answer


The answer lies in the question:

There are about 1 million records that contain "milk".

Index or not, you still need to retrieve those million rows. Also note that if a million rows means most of the table, Postgres will ignore the index entirely and seq scan the whole table.
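
To confirm which plan the planner actually picks, run the query under EXPLAIN, for example:

explain (analyze, buffers)
select recordid from jtest where tsv @@ to_tsquery('english', 'milk');

If the output shows a Seq Scan rather than a Bitmap Index Scan on the GIN index, the planner has decided the index would not pay off for a term this common.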

The index will help if you change the query to:

select recordid from jtest where tsv @@ to_tsquery('english', 'rare string');
Denis de Bernardy