1

Here I am creating table product_feature_text, having a 1:N relation with table product. As the application must support several user languages, a lang_code column is added to segment english texts from other language texts.

As I want to present the product features alphabetically ordered in every language, I have created four partial indexes with their specific collate. It is expected that all products features have title in all of the four languages, i.e., there will be 25% of rows with lang_code = 'ES', for example.

This is an oversimplification of the real case but enough to depict the situation.

create table product_feature_text (
  id          bigint generated by default as identity primary key,

  -- reference to the parent product
  product_id  bigint not null,

  -- language dependent columns
  lang_code   char(2),
  title       varchar,

  foreign key (product_id) references product (id)
);

create index on product_feature_text (title collate "en-US") where lang_code = 'EN';
create index on product_feature_text (title collate "es-ES") where lang_code = 'ES';
create index on product_feature_text (title collate "fr_FR") where lang_code = 'FR';
create index on product_feature_text (title collate "de_DE") where lang_code = 'DE';

Is this the best index approach for the case?

Addendum from a comment: a typical query would be

select text
from product_feature
where product_id = 1024
   and lang_code = 'FR'
order by title collate "fr_FR"

where product_id could be anything.

Laurenz Albe
  • 209,280
  • 17
  • 206
  • 263
coterobarros
  • 941
  • 1
  • 16
  • 25

1 Answers1

2

It depends on the intended use of the indexes.

If you want to use them for

SELECT ... FROM product_feature_text
WHERE lang_code = 'EN' AND ...
ORDER BY title COLLATE "en-US";

your indexes might be useful.

Also, if your query looks like

WHERE product_feature_text > 'bhd'  COLLATE ...

it might help.

However, for most cases that I can envision, a single index whose collation doesn't matter would be better.

For the query in the addendum, the perfect index would be:

CREATE INDEX ON product_feature (product_id, title COLLATE "fr_FR")
   WHERE lang_code = FR';
Laurenz Albe
  • 209,280
  • 17
  • 206
  • 263
  • Thanks again @Laurenz Albe. Yes, a typical query would be `select text from product_feature where product_id = 1024 and lang_code = 'FR' order by title collate "fr_FR"` being `1024` any other valid product_id – coterobarros Sep 08 '20 at 15:34
  • For those 4 languages, `collate` would not help too much. Maybe the ß in deustch and some ç in french – coterobarros Sep 08 '20 at 15:36
  • 1
    I have added an index suggestion to the answer. If `product_id` is very selective, I'd prefer a simple indexon just that column. Having many indexes is bad for planning time. – Laurenz Albe Sep 09 '20 at 05:51
  • A typical product may have 5-30 features. This means a total of 20-120 rows in `product_feature` for a single product in the four languages. I understand that you suggest that scanning those rows is worth the pain of having 4 partial indexes. – coterobarros Sep 09 '20 at 08:17
  • I will be testing this empirically. Thank you! – coterobarros Sep 09 '20 at 08:18