0

I'm using postgres full text search for (amoung other things) to provide autocomplete functionality for usernames and tags. However, I'd like autocomplete to match the column value 'dashed-tag-example' against a ts_query like 'dashedtag:*'.

My understanding is that, to do this without duplicating the column in my table I need to create a dictionary along the lines of the simple dictionary that strips charachters like '-'. Is it possible to create such a dictionary using SQL (i.e. something I could put in a rails migration)?

It seems like it should somehow be possible to define a dictionary (or do I need a parser?) that uses postgres's regexp substition functions but I can't seem to find any examples online of how to create a dictionary (parser?) like that. Is this possible? How?

Peter Gerdes
  • 2,288
  • 1
  • 20
  • 28

1 Answers1

1

The dictionary is too late; you would need a different parser, which would require writing C code.

The simple and pragmatic solution is to use replace() to strip the - when you construct the tsvector.

You don't need to create a new column for that, simply search like this:

SELECT ... FROM ...
WHERE to_tsvector('english', replace(col, '-', ''))
      @@ to_tsquery('english', replace('search-string', '-' ''));
Laurenz Albe
  • 209,280
  • 17
  • 206
  • 263
  • But that would mean creating another column correct? Suppose I have a column tagname with entries like 'full-text-search' but I want to match that with a prefix search 'fulltex'. I'd need to create a column like 'normalized_tagname' and search against that no? – Peter Gerdes Feb 25 '22 at 15:13
  • So far my hacky solution has been to use the unaccent extension and created a ruleset that simply deletes the dashes. However, that's annoying in that I have to go install the ruleset in the postgres directory when I deploy. I feel like I'm overcomplicating things but not sure. – Peter Gerdes Feb 25 '22 at 15:15
  • Ohh wait? Can I use replace in the definition of the GIN index? I'm pretty sure I can't (and even if I could then I could no longer have a different search config that didn't ignore - when consulting that column no?). So I'm guessing my unaccent solution is the best I can do. – Peter Gerdes Feb 25 '22 at 15:39
  • 1
    Sure you can use `replace` in an index definition, since it is immutable. You could have two indexes on the column for two different searches. – Laurenz Albe Feb 25 '22 at 15:43
  • I think I'm pretty confused then. If I try to match 'fulltex' against 'full-text-search' then, even if I used replace in my index definition, won't postgres throw out 'full-text-search' as a match when it actually loads the row if the full text search config doesn't say to ignore dashes? I thought indexes only affected speed and never result sets. – Peter Gerdes Feb 25 '22 at 15:49
  • 1
    That is correct. But if you look at my edited answer, I changed the query so that it also uses `replace`. – Laurenz Albe Feb 25 '22 at 15:51
  • Ohh, I see now how it works. Now my sql query is asking to find a column where calling replace on both sides gives a match. Thanks so much that's exactly what I needed. Doubt pg_search gem will recognize that but it's a simple enough search so I'll just drop back to just doing it directly in SQL. Thanks so much!! – Peter Gerdes Feb 25 '22 at 15:55