
I'm currently trying to use Sphinx to search through medicine names. Because of the way medicine names are structured in the US, every entry in the database is stored as medicine_type, and there's no way of changing that going forward.

If, for instance, I search for medicine_type, Sphinx finds it easily, but if I type just the medicine name to bring up all types of that medicine, it finds nothing.

I've tried enabling expand_keywords = 1 to no avail.

Is there anything I can do to make Sphinx do what I need it to?

Jonah Hart

1 Answer

Well, the default charset_table includes the underscore as a word character:

http://sphinxsearch.com/docs/current.html#conf-charset-table

# default are English and Russian letters
charset_table = 0..9, A..Z->a..z, _, a..z, \
    U+410..U+42F->U+430..U+44F, U+430..U+44F, U+401->U+451, U+451

A simple solution might be to remove it from the charset table, i.e. define charset_table explicitly without _ in the list:

# custom charset without underscore
charset_table = 0..9, A..Z->a..z, a..z, \
    U+410..U+42F->U+430..U+44F, U+430..U+44F, U+401->U+451, U+451

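... the underscore would then be treated as a word separator (like a space), so medicine_type is indexed as two separate words and a search for just the medicine name would match. The index needs to be rebuilt after changing charset_table for this to take effect.

(You could also remove the Russian characters if you don't need them, and check whether there are other characters you want to index.)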
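
To see why dropping the underscore makes a difference, here is a rough Python sketch of the tokenization idea. It is only an illustration, not Sphinx's actual tokenizer: the `tokenize` helper, the two character sets and the sample value `Ibuprofen_Tablet` are all made up for the example.

```python
import string

# Simplified stand-ins for the two charset_table variants (Latin letters and
# digits only; the Cyrillic ranges are left out to keep the sketch short).
DEFAULT_CHARS = set(string.ascii_lowercase + string.digits + "_")
NO_UNDERSCORE_CHARS = DEFAULT_CHARS - {"_"}


def tokenize(text, word_chars):
    """Lowercase the text and split it on every character not in word_chars."""
    tokens = []
    current = []
    for ch in text.lower():
        if ch in word_chars:
            current.append(ch)
        elif current:
            # A non-word character ends the current token.
            tokens.append("".join(current))
            current = []
    if current:
        tokens.append("".join(current))
    return tokens


# Hypothetical database value in the medicine_type format.
value = "Ibuprofen_Tablet"

default_tokens = tokenize(value, DEFAULT_CHARS)
custom_tokens = tokenize(value, NO_UNDERSCORE_CHARS)

print(default_tokens)  # one token; a query for just "ibuprofen" has nothing to match
print(custom_tokens)   # two tokens; "ibuprofen" is now a keyword of its own
```

With the underscore in the character set the whole value is a single keyword, which is why only the full medicine_type string matches. Without it, the name and the type become separate keywords.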


You might also want to at least consider blend_chars (http://sphinxsearch.com/docs/current.html#conf-blend-chars), although I don't think it particularly helps in this situation.

barryhunter