I'm running a Sphinx search, but it's turning up some really weird results. Any help is appreciated.

So for example if I type "50", I get:

  • 50 Cent
  • 50 Lions
  • 50 Foot Wave, etc.

This is great, but when I search "50 Ce", I get:

  • Ryczące Dwudziestki
  • Spisek
  • Bernhard Gal
  • Cowabunga Go-Go

And other crazy results. Also, when I search for "50 Cent", the correct result is at the top, but random results follow below it. Any ideas why?

PHP code:

$query = $_GET['query'];

if (!empty($query))
{
 $sphinx->SetMatchMode(SPH_MATCH_ALL);
 $sphinx->AddQuery($query, 'artists');
 $sphinx->AddQuery($query, 'variations');

 $sphinx->SetFilter('name', array(3));

 $sphinx->SetLimits(0, 10);

 $result = $sphinx->RunQueries();

 echo '<pre>';

 switch ($result)
 {
  case false:
   echo 'Query failed: ' . $sphinx->GetLastError() . "\n";
   break;
  default:
   if ($sphinx->GetLastWarning())
   {
    echo 'WARNING: ' . $sphinx->GetLastWarning() . "\n";
   }

   if (is_array($result[0]['matches']) && count($result[0]['matches']))
   {
    foreach ($result[0]['matches'] as $value => $info)
    {
     $artist = artistDetails($value);
     echo $artist['name'] . "\n";
    }
   }
 }
}

Sphinx Index and Source:

source artists
{
 type     = mysql

 sql_host    = localhost
 sql_user    = user
 sql_pass    = pass
 sql_db     = db
 sql_port    = 3300

 sql_query    = \
  SELECT \
    id, name \
  FROM artists;

 #UNIX_TIMESTAMP(time)
 #sql_attr_uint   = group_id
 #sql_attr_timestamp  = time

 sql_query_info   = SELECT id,name FROM artists WHERE id=$id
}

index artists
{
 source     = artists
 path     = /var/sphinx/artists
 docinfo     = extern
 charset_type   = utf-8
}
OMG Ponies
James

1 Answer

You need to use the min_prefix_len index config option to tell Sphinx to index and match on partial words. You'll probably also need to set enable_star to 1.

http://www.sphinxsearch.com/docs/current.html#conf-min-prefix-len

index artists
{
 source     = artists
 path     = /var/sphinx/artists
 docinfo     = extern
 charset_type   = utf-8
 min_prefix_len   = 2
 enable_star   = 1
}

After enabling prefix indexing, you'll be able to search for things like "50 Ce*" to get partial word matches. If you want partial word matches to work without requiring your users to add the * themselves, you'll probably have to modify the search string programmatically before passing it to Sphinx.
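
One way to do that rewriting is sketched below. This is only an illustration: `addPrefixWildcard` and its `$minPrefixLen` parameter are hypothetical names, not part of the Sphinx API, and the sketch assumes you only want the last word (the one the user is still typing) treated as a prefix.

```php
<?php
// Hypothetical helper, not part of the Sphinx API: appends "*" to the last
// word of a search string so that Sphinx matches it as a prefix.
// $minPrefixLen is assumed to mirror the min_prefix_len value in the index config.
function addPrefixWildcard(string $query, int $minPrefixLen = 2): string
{
    $query = trim($query);
    if ($query === '') {
        return '';
    }

    // Split the query into words on runs of whitespace.
    $words = preg_split('/\s+/', $query);
    $last = count($words) - 1;

    // Only add the wildcard when the word is long enough to be prefix-indexed
    // and does not already end in one. strlen() counts bytes, which is enough
    // for ASCII input; multibyte input would need a character-aware length.
    if (strlen($words[$last]) >= $minPrefixLen && substr($words[$last], -1) !== '*') {
        $words[$last] .= '*';
    }

    return implode(' ', $words);
}

echo addPrefixWildcard('50 Ce'), "\n"; // prints "50 Ce*"
```

The returned string would then be passed to AddQuery in place of the raw `$_GET['query']` value.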

Ty W
  • This is fantastic. I'm not sure if this is only true of newer versions of Sphinx or not, but you may not need to set `enable_star = 1` at all. I didn't modify from the default and my queries worked just the same. Also, for most use cases (not this one) I think having a `min_prefix_len` less than 4 may be unnecessary. If anyone can comment on performance here, I'd be most grateful. – Josh Smith Sep 29 '10 at 06:23
  • Ignore what I said about `min_prefix_len`. For search autocompletion, I'm finding it does wonders to have it set to 2. – Josh Smith Oct 07 '10 at 21:05
  • Hi, I am new to Sphinx and am using a Linux server, but I am getting error no. 111. – Karthik Apr 20 '11 at 12:57
  • @KSReddy: Error code 111 is a connection error. Check the port you provided to Sphinx; it might be wrong. Usually, if you are using the Sphinx API, the port to use is 9312. I too banged my head on this for around 3 hours when I was working on it. – Nishant Shrivastava May 23 '11 at 12:51
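
A quick way to check the port advice above is to test whether anything is listening on it before involving the Sphinx client at all. The sketch below is only an illustration: `sphinxPortReachable` is a hypothetical helper name, and the host and port are assumptions that should be replaced with your own searchd settings.

```php
<?php
// Hypothetical helper: returns true if a TCP connection to $host:$port can be
// opened within $timeout seconds. A "connection refused" failure usually means
// nothing is listening on that port.
function sphinxPortReachable(string $host, int $port, float $timeout = 2.0): bool
{
    $errno = 0;
    $errstr = '';

    // @ suppresses the PHP warning that fsockopen() emits on failure.
    $socket = @fsockopen($host, $port, $errno, $errstr, $timeout);
    if ($socket === false) {
        return false;
    }

    fclose($socket);
    return true;
}

// Assumed values: searchd on localhost, listening on the API port 9312.
echo sphinxPortReachable('127.0.0.1', 9312) ? "port is open\n" : "port is not reachable\n";
```

If this reports the port as not reachable, the problem is with searchd or its listen setting rather than with the query code.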