1

I am using Hasura on top of Postgres to store companies and other information. The source data has the same spelling of records, but in differing cases - for example:

name country
Reckitt Benckiser (Aust) Pty Ltd US
RECKITT BENCKISER (AUST) PTY LTD AU

I have an index on the name of the company:

CREATE UNIQUE INDEX name_company_unique_idx on company (LOWER(name));

When I "upsert" the second record, it throws a uniqueness error, so the index is working correctly. However what I want is for the second record to realise the first record is the same thing, and update the country to AU. If I don't have the index, I get two records for the same entity, in differing cases.

I don't want to store everything in upper or lower case, but am happy for it to take whatever case the first record that was inserted is.

How can I make the upsert process case agnostic?

eshirvana
  • 23,227
  • 3
  • 22
  • 38
user1048175
  • 1,100
  • 3
  • 12
  • 29

2 Answers2

3

You can use a case insensitive collation or the citext extension for the table column and use INSERT ... ON CONFLICT with a normal unique constraint on the column.

Laurenz Albe
  • 209,280
  • 17
  • 206
  • 263
  • Thanks, I'll check out the collation as from what I can see `citext` calls a `lower()` every time it is called and I'd like to keep things as efficient as possible – user1048175 Jan 11 '21 at 08:12
  • Of course you will have to pay a price for case insensitive search, since you can no longer use binary comparison. – Laurenz Albe Jan 11 '21 at 08:27
2

You didn't specify what version of Postgres you are running, but if it is 12 or above you can add a virtual (generated) column, then put a unique constraint on that column. See example:

alter table table_name add v_company_name text generated always as ( lower(company_name) ) stored;
alter table table_name add constraint company_name_bk unique (v_company_name);  
Shorn
  • 19,077
  • 15
  • 90
  • 168
Belayer
  • 13,578
  • 2
  • 11
  • 22