24

How do you escape special characters in a string passed to to_tsquery? For instance, this kind of query:

select to_tsquery('AT&T');

Produces:

NOTICE:  text-search query contains only stop words or doesn't contain lexemes, ignored

 to_tsquery 
------------

(1 row)

Edit: I also noticed that there is the same issue in to_tsvector.

Konrad Garus

3 Answers

10

A simple solution is to create the tsquery as follows:

select $$'AT&T'$$::tsquery;

You can make more complex queries:

select $$'AT&T' & Phone | '|Bang!'$$::tsquery;

See the text search docs for more.
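
When the search term comes from user input, the quoting can be done before the cast. Below is a minimal Python sketch, assuming the tsquery quoting rules described in the PostgreSQL docs (inside a quoted lexeme, embedded single quotes and backslashes are doubled). The name `quote_tsquery_lexeme` is a hypothetical helper, not part of any library:

```python
def quote_tsquery_lexeme(term: str) -> str:
    """Wrap term in single quotes so tsquery reads it as a single lexeme."""
    # Double backslashes first, then single quotes, so that operator
    # characters such as & and | inside the term stay literal.
    escaped = term.replace("\\", "\\\\").replace("'", "''")
    return "'" + escaped + "'"


print(quote_tsquery_lexeme("AT&T"))      # 'AT&T'
print(quote_tsquery_lexeme("O'Reilly"))  # 'O''Reilly'
```

The returned string is meant to be sent as a bound parameter and cast to tsquery on the server, rather than concatenated into the SQL text.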

David Weber
    Caution: this won't apply the various dictionary transformations performed by `to_tsquery`, such as stemming, synonym grouping and stopword removal. This may mean that some queries will never find a match. – Ben Whitmore Aug 29 '22 at 06:53
8

I found this very useful: it uses the plainto_tsquery('AT&T') function. https://stackoverflow.com/a/16020565/350195
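
Note that plainto_tsquery does not recognize tsquery operators or prefix-match labels in its input, so a prefix search with :* has to go through to_tsquery. Below is a rough Python sketch of one way to build such a query string from free text, assuming that to_tsquery accepts a quoted term followed by :* (worth verifying on your PostgreSQL version). The name `prefix_tsquery` is hypothetical:

```python
def prefix_tsquery(user_input: str) -> str:
    """Build a to_tsquery string that prefix-matches every word in the input."""
    terms = []
    for word in user_input.split():
        # Quote each word so characters like & or | are not read as operators;
        # embedded single quotes and backslashes are doubled.
        escaped = word.replace("\\", "\\\\").replace("'", "''")
        terms.append("'" + escaped + "':*")
    # Require all words to match by joining them with the AND operator.
    return " & ".join(terms)


print(prefix_tsquery("AT&T phone"))  # 'AT&T':* & 'phone':*
```

The result would be passed as a bound parameter to to_tsquery. How the quoted text is tokenized afterwards still depends on the text search configuration, and empty input yields an empty string, which the caller should handle separately.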

peralmq
    Is there any way to support partial matching with this, e.g. by adding :* onto the end? – Colin D Aug 18 '17 at 02:15
    @ColinD I'm sorry, but have you found a way to do this? – Mogost Aug 05 '20 at 11:22
    I left the project I was using this for, and I don't remember finding a solution. My final query was like this: `query = "SELECT o.ortholog_id, g.gene_id, g.species_id, g.description, g.symbol FROM search_index s JOIN orthologs o on o.ortholog_id = s.ortholog_id JOIN genes g on o.gene_id = g.gene_id WHERE s.document @@ to_tsquery('english', ?search) ORDER BY ts_rank(s.document, to_tsquery('english', ?search)) DESC, o.ortholog_id LIMIT 300";` and, if I remember correctly, the user could manually add * to the query. – Colin D Aug 05 '20 at 13:42
2

If you want 'AT&T' to be treated as a single search word, you're going to need some customised components, because the default parser splits it into two words:

steve@steve@[local] =# select * from ts_parse('default', 'AT&T');
 tokid | token 
-------+-------
     1 | AT
    12 | &
     1 | T
(3 rows)
steve@steve@[local] =# select * from ts_debug('simple', 'AT&T');
   alias   |   description   | token | dictionaries | dictionary | lexemes 
-----------+-----------------+-------+--------------+------------+---------
 asciiword | Word, all ASCII | AT    | {simple}     | simple     | {at}
 blank     | Space symbols   | &     | {}           |            | 
 asciiword | Word, all ASCII | T     | {simple}     | simple     | {t}
(3 rows)

As you can see from the documentation for CREATE TEXT PARSER, this is not trivial, as the parser appears to need to be a C function.

You might find this post of someone getting "underscore_word" to be recognised as a single token useful: http://postgresql.1045698.n5.nabble.com/Configuring-Text-Search-parser-td2846645.html

araqnid