
I have a situation where I want to add a partial unique index to a table. The index must apply only to a subset of existing rows and to all rows added henceforth. I have to create this index in multiple databases, and the problem is that I have to use the data in each database's table to determine the 'where' clause.

The following query gives an error:

create unique index user_car_unique 
ON car_usage(user_id, car_id) 
where date_created > 
  (select max(date_created) from car_usage where id > 10);

The error I get is:

ERROR: cannot use subquery in index predicate
SQL state: 0A000
Character: 98

However the following query works:

create unique index user_car_unique 
ON car_usage(user_id, car_id) 
where date_created > '2014-08-12';

Is there a way I can get around the "cannot use subquery in index predicate" error?

My project uses Grails. I will write a database migration (DBM) script to add this index, which will execute at startup.
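One possible workaround, since PostgreSQL rejects subqueries only inside the index predicate itself, is to run the subquery first and interpolate its result as a literal when building the DDL. This is a sketch, not a tested migration, and it assumes the same car_usage columns and cutoff query from the question:

do $$
declare
  cutoff timestamp;
begin
  -- run the subquery separately, outside the index predicate
  select max(date_created) into cutoff
  from car_usage
  where id > 10;

  -- %L quotes the value as a literal, so the predicate is constant
  execute format(
    'create unique index user_car_unique
       on car_usage (user_id, car_id)
       where date_created > %L',
    cutoff);
end
$$;

Because each database runs the block against its own data, the same script yields a different (but constant) cutoff per database.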

The reason I need this:

I have a table with duplicate entries. These entries are there because of a business requirement; to implement it I had to insert duplicates into the table. That requirement has now changed: there should be NO duplicates in the table 'henceforth', while the old duplicates should remain as they are. To enforce this at the DB layer I want to add a unique constraint, but I cannot apply a plain constraint because of the old duplicates. Hence I opted for a partial index. Is there any other way I can achieve this?

matuda
    You can't create such an index (and what would be the use of that?) –  Sep 16 '14 at 14:46
  • The index creation works when I provide a value in where clause. Imagine a case where business requirement changed and we do not allow any duplicate checkouts. There are other zones with their own db which have different "date of effect" for this requirement. – matuda Sep 16 '14 at 14:52
  • You can't create a partial index that uses a mutable expression as a filter. This can't work. – Ihor Romanchenko Sep 16 '14 at 15:06
  • Thank you.. I found the similar answers on other blogs/posts.. I fear the last resort will be to ship different scripts for different database... – matuda Sep 16 '14 at 16:06
  • @matuda You can create an immutable function around an SQL query and use it in the index. Still, you must be sure that the SQL query's results are immutable. – Ihor Romanchenko Sep 16 '14 at 18:49
  • The reason it isn't working is what happens if a new INSERT changes MAX(date_created). What is the behavior supposed to be? Should previously indexed entries be de-indexed? Even more interesting, suppose you DELETE the row with the MAX(date_created). Are we now supposed to re-index the entire table? I am wondering _why_ you want this behavior. Why will it not work to index all the rows? – Andrew Lazarus Sep 17 '14 at 00:43
  • Thanks @AndrewLazarus Please see my edit where I have tried to answer "why I need this type of change" – matuda Sep 17 '14 at 01:43
  • 1
    In this case, you have a hard date to use as a cutoff, namely, the date the business logic changed. I hope your DB supports the `COMMENT` command, because whoever works on this after you will be bewildered. – Andrew Lazarus Sep 17 '14 at 02:31

1 Answer


Since you've stated that this is due to a change in business requirements and the issue that causes the duplicates will no longer occur, I would suggest instead moving the duplicate entries into an archive table which inherits from the main table.

Then queries against the main table will also reach down into the inherited table unless you use select ... from only, and once you move the duplicate entries out into the archive table you can have a regular unique index on the main table.
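To illustrate the difference (a sketch; the table names here are just placeholders for a parent table and a child that inherits from it):

select * from parent;        -- returns rows from parent plus all tables inheriting from it
select * from only parent;   -- returns rows stored in parent itself

This is what lets the archive rows stay queryable by default while being excluded from the table the unique constraint guards.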

So if you had something like:

create table foo
(
  table_id serial,
  id int not null,
  val text not null,
  created timestamp without time zone default now() not null
);


insert into foo (id, val, created)
  values (1, 'one', '2014-09-14 09:00:00'),
         (2, 'two', '2014-09-14 11:00:00'),
         (2, 'two', '2014-09-14 12:00:00'),
         (2, 'two', '2014-09-14 13:00:00'),
         (3, 'three', now());


create table foo_archive
(
) inherits (foo);

You could then remove the duplicates from the main table like this. It assumes you have some unique identifier that goes beyond the unique index you were trying to add; if not, decide which row to keep (earliest created time, or whatever fits your use case best with the data you have):

create temp table min_table_ids as
select id, val, min(table_id) as min_table_id
from foo
group by id, val
order by id, val;

insert into foo_archive
select *
from foo
where table_id NOT IN (select min_table_id
                       from min_table_ids
                      );

delete from only foo
where table_id NOT IN (select min_table_id
                       from min_table_ids
                      );

select *
from only foo
order by id, val;

select *
from foo_archive
order by id, val;


I have used this general technique in the past to separate different groups of otherwise similar data and find that it works pretty well.

It also has the side benefit of making your main table smaller, and it makes it easy to splice the older data out of a query via select ... from only, or to keep the older data in scope with a regular select.

Then, once you have the duplicates removed, you can add a regular unique constraint (instead of a unique index) on the main table, i.e.

alter table foo
  add constraint foo_uniq_key unique (id, val);
khampson