11

I have the following structure.

CREATE TABLE join_table (
  id integer NOT NULL,
  col_a integer NOT NULL,
  col_b integer NOT NULL
)

CREATE INDEX index_on_col_a ON join_table USING btree (col_a);
CREATE INDEX index_on_col_b ON join_table USING btree (col_b);
CREATE UNIQUE INDEX index_on_col_a_and_col_b ON join_table USING btree (col_a, col_b);

There are also foreign keys on col_a and col_b.

Clearly index_on_col_a is no longer needed, but is there a cost or benefit to keeping or deleting it?

My guess is;

  • keeping it will slow down inserts
  • selects using just col_a may be faster if I keep it
Matthew Rudy
  • 16,724
  • 3
  • 46
  • 44

2 Answers2

10

You can drop the index on col_a. PostgreSQL is able to use the combined index if you query on col_a and is also able to use the index if you query on col_a and col_b. These query types can use the combined index:

WHERE col_a = 'val'
WHERE col_a = 'val' AND col_b = 'val'

The combined index cannot be used to query only col_b or an OR junction of col_a and col_b. So the additional index over col_b can make sense if you frequently have queries querying only col_b.

Edit: So: you don't have an advantage creating index_on_col_a, but you have a slower write speed. Drop it.

Andriy M
  • 76,112
  • 17
  • 94
  • 154
ckruse
  • 9,642
  • 1
  • 25
  • 25
0

Even though I agree with the other answer on dropping the index on col_a, sometimes index combinations could be so large that, the index on (col_a, col_b) takes more disk pages, compared to col_a index, which could lead to more I/O on disk. Please use EXPLAIN ANALYZE and EXPLAIN FORMAT=JSON to find the actual rows read, and total cost (represented with equivalent of I/O ops).

If there are more col_b per col_a (per 1 col_a, there are >100 col_b or so), then having the col_a will be helpful. if you are doing range queries, this will be more useful in that case. All these make sense if you really care about very low latency during reads.

yugandhar
  • 580
  • 7
  • 16