We are using a custom text search configuration for searching German texts with proper support for compound words.
The dictionary can be found here: http://www.sai.msu.su/~megera/postgres/gist/tsearch/V2/ (ispell-german-compound.tar.gz).
The dictionary files were converted to UTF-8, and I used the following script to add the configuration to the database:
DROP TEXT SEARCH DICTIONARY IF EXISTS german_bon_ispell CASCADE;
DROP TEXT SEARCH DICTIONARY IF EXISTS german_bon_stem CASCADE;
-- Also drop the configuration itself, otherwise re-running the script
-- fails at the CREATE TEXT SEARCH CONFIGURATION step:
DROP TEXT SEARCH CONFIGURATION IF EXISTS german_bon;

CREATE TEXT SEARCH CONFIGURATION german_bon (COPY = german);

CREATE TEXT SEARCH DICTIONARY german_bon_stem (
    TEMPLATE = snowball,
    Language = german,
    StopWords = german
);

CREATE TEXT SEARCH DICTIONARY german_bon_ispell (
    TEMPLATE = ispell,
    DictFile = german,
    AffFile = german,
    StopWords = german
);

ALTER TEXT SEARCH CONFIGURATION german_bon
    ALTER MAPPING FOR
        asciiword, word, numword, numhword, hword_asciipart, hword_part, hword_numpart
    WITH german_bon_ispell, german_bon_stem;
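As a quick sanity check that the new configuration is actually picked up and that the ispell dictionary handles compounds, a query like the following can be run (the compound "Hauptbahnhof" is just an illustrative input; the actual lexeme output depends on the dictionary files):

```sql
-- With a working compound dictionary this should produce lexemes for
-- the compound's components rather than treating it as one opaque word.
SELECT to_tsvector('german_bon', 'Hauptbahnhof');
```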
The dictionary itself works great, but on every new connection/session the first query using this configuration takes 1-2 seconds; every subsequent one takes ~1-3 ms.
The same effect is observable with the English configuration, though far less drastically:
db=# \timing
Timing is on.
db=# select ts_debug('english', 'Book');
ts_debug
-----------------------------------------------------------------------
(asciiword,"Word, all ASCII",Book,{english_stem},english_stem,{book})
(1 row)
Time: 6,977 ms
db=# select ts_debug('english', 'Book');
ts_debug
-----------------------------------------------------------------------
(asciiword,"Word, all ASCII",Book,{english_stem},english_stem,{book})
(1 row)
Time: 2,258 ms
db=# select ts_debug('german_bon', 'Buch');
ts_debug
---------------------------------------------------------------------------------------------------
(asciiword,"Word, all ASCII",Buch,"{german_bon_ispell,german_bon_stem}",german_bon_ispell,{buch})
(1 row)
Time: 916,286 ms
db=# select ts_debug('german_bon', 'Buch');
ts_debug
---------------------------------------------------------------------------------------------------
(asciiword,"Word, all ASCII",Buch,"{german_bon_ispell,german_bon_stem}",german_bon_ispell,{buch})
(1 row)
Time: 1,240 ms
db=#
The only work-around I'm currently aware of is the use of persistent connections/connection pooling, and we are using pgbouncer for that. But that introduces some other problems with our client stack (PHP > PDO > Doctrine) which look like a caching issue.
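Since we already run pgbouncer, one mitigation we could try is its connect_query setting, which runs a query once per new *server* connection; a cheap warm-up query against the configuration there would pay the dictionary load cost when the backend is created instead of on the first client query. A minimal sketch (the warm-up text is arbitrary; pool_mode shown only to note that backends must be reused for this to help):

```ini
; pgbouncer.ini sketch: connect_query is executed after each new
; server connection is established, so the ispell dictionary gets
; loaded into that backend before any client query uses it.
[pgbouncer]
pool_mode = session
connect_query = SELECT to_tsvector('german_bon', 'warmup')
```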
Is there any way to reduce this "startup time"? It looks as if the configuration is loaded/parsed anew for every connection, which just doesn't seem reasonable.