I am using functions like to_tsquery and to_tsvector in PostgreSQL to perform full text search, but PostgreSQL supports only a limited number of languages by default. I need my database to support Georgian so that I can run searches in that language.

Where can I find or download a text search configuration for Georgian, and how do I apply it to my PostgreSQL instance?

Is there a guide that explains how to create this kind of configuration from language dictionaries?

EDIT:

OK, here is what I have so far.
It was pointed out in the comments that I should look for Snowball or Ispell dictionaries. I found a dictionary for the Georgian language by simply googling "georgian" "hunspell", but now I have a problem creating the text search configuration that uses this dictionary.
I created a dictionary in PostgreSQL using

CREATE TEXT SEARCH DICTIONARY georgian_hunspell (
    TEMPLATE = ispell,
    DictFile = ka_GE,
    AffFile = ka_GE
);

and tested it using

select ts_lexize('georgian_hunspell', 'ვაშლი');

which works fine. But creating a configuration on top of it does not work. I tried this:

CREATE TEXT SEARCH CONFIGURATION georgian_hunspell_configuration (parser = default);

and then this:

ALTER TEXT SEARCH CONFIGURATION georgian_hunspell_configuration
    ALTER MAPPING FOR asciiword, asciihword, hword_asciipart, word, hword, hword_part
    WITH georgian_hunspell;

but to no avail. I tested it with this:

select * from ts_debug('georgian_hunspell_configuration', 'ვაშლი');

The result is

 alias |  description  | token | dictionaries | dictionary | lexemes 
-------+---------------+-------+--------------+------------+---------
 blank | Space symbols | ვაშლი | {}           |            | 
(1 row)

It says "Space symbols". Is that because the parser could not extract any tokens from the input? If so, why? Is it because this language uses a different alphabet? Should I write my own parser?
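One thing that may be worth checking here, as a sketch rather than a confirmed diagnosis: the PostgreSQL documentation notes that the default parser's notion of a "letter" is determined by the database's locale setting, specifically lc_ctype. The queries below show the locale of the current database and the raw token types that the default parser assigns to the sample word, without involving any dictionary or configuration.

```sql
-- Show the collation and character classification (ctype) of the current database.
SELECT datname, datcollate, datctype
FROM pg_database
WHERE datname = current_database();

-- Run only the default parser on the sample word and
-- translate each numeric token type into its alias and description.
SELECT t.alias, t.description, p.token
FROM ts_parse('default', 'ვაშლი') AS p
JOIN ts_token_type('default') AS t ON t.tokid = p.tokid;
```

If datctype turns out to be C or POSIX, the parser may not be treating Georgian characters as letters, which would be consistent with the blank token shown above. That is an assumption to verify, not an established cause.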

How can I make this work?

  • The documentation for [Snowball](https://www.postgresql.org/docs/current/textsearch-dictionaries.html#TEXTSEARCH-SNOWBALL-DICTIONARY) and [Ispell](https://www.postgresql.org/docs/current/textsearch-dictionaries.html#TEXTSEARCH-ISPELL-DICTIONARY) dictionaries has some pointers. – Laurenz Albe May 31 '23 at 16:20
