0

I have a table called 'uploads'. The columns are id, date, last_update, description, tags. And I added the fulltext index 'description_tags' which includes those both columns description and tags. The table format is MyISAM. I am using USBWebserver if this could be the problem.

Now I try to select the uploads which have the provided GET variables in them.

MATCH (description,tags) AGAINST ("'.urldecode($_GET['tags']).'")

But the problems with the results I have are: There is no result when i search for 'lorem', but there is one when I add IN BOOLEAN MODE to the query. The 'Lorem' is actually in the description column. Searching keywords in the tags column don't have that problem. And even more interesting/annoying when I search for 'moree', which is also in the description table, I get results with and without the boolean mode...

What is the problem here? I don't see it.

Update

if ( $upload_display_sort == 'popular' )//sort by popularity

$query =    //select 'u' (a new defined variable?)
            'SELECT u.* '.
            //from the following table
            'FROM '.
            //uploads table as variable u
            'uploads u '.
            //join the following table
            'LEFT OUTER JOIN '.
            //select the column upload_id, count all columns into variable 'num' ???
            '(SELECT upload_id, COUNT(*) AS num '.
            //from the favorites table as variable f
            'FROM favorites f '.
            //and group it by the upload_id in the favorites table
            'GROUP BY upload_id) '.
            //what does this do? combine rows where the u.id == f.upload_id ???
            'f ON u.id = f.upload_id';

else $query = 'SELECT * FROM uploads';//select all from uploads



$query .=   ' WHERE online_state = 1';//select all online uploads



//FILTER FAVORITES

if  ($upload_display_sort == 'favorites' and !empty($favorites_ids))

$query .= ' AND id IN ('.implode(',', $favorites_ids).')';//returns 1,2,3,4,5

elseif  ($upload_display_sort == 'favorites' and empty($favorites_ids))

$query .= ' AND id IN (0)';//no upload id is 0



//FILTER SEARCH

if  (   isset($_GET['tags'])    )

$query .= ' AND MATCH (description,tags) AGAINST ("'.urldecode($_GET['tags']).'" IN BOOLEAN MODE)';



//FILTER DATE

if  (   isset($_GET['timeframe']    )

    and (   $_GET['timeframe'] == '3days'

        or  $_GET['timeframe'] == '2weeks'

        or  $_GET['timeframe'] == '3months')    )

{

    $end_date = time();//current time in unix format

    switch  (   $_GET['timeframe']  )

    {
        case '3days':   $start_date = strtotime('-3 days',$end_date);   break;
        case '2weeks':  $start_date = strtotime('-2 weeks',$end_date);  break;
        case '3months': $start_date = strtotime('-3 months',$end_date); break;
        default:        $start_date = strtotime('');    break;//1970-01-01 01:00:00
    }

    $end_date = date("Y-m-d H:i:s", $end_date);//current time in mysql format

    $start_date = date("Y-m-d H:i:s", $start_date);//end time in mysql format

    $query .= ' AND last_update BETWEEN "'.$start_date.'" AND "'.$end_date.'"';

}



//ORDER

$query .= ' ORDER BY';

if ( $upload_display_sort == 'popular' )//sort by popularity

$query .= ' f.num DESC,';

$query .= ' last_update DESC, id DESC';
marc_s
  • 732,580
  • 175
  • 1,330
  • 1,459

1 Answers1

0

Your test data may be incomplete. Note the following from the docs:

Words that are present in 50% or more of the rows are considered common and do not match.

Yet BOOLEAN MODE searches differ in that:

They do not use the 50% threshold.

I'm guessing "lorem" appears in 50% or more of the rows.

BOOLEAN MODE full text searches don't automatically sort by decreasing relevance. You'll need to sort them yourself like this:

SELECT *, MATCH (description,tags) AGAINST ("lorem") AS relevance
FROM uploads
WHERE MATCH (description,tags) AGAINST ("lorem" IN BOOLEAN MODE)
ORDER BY relevance DESC
Marcus Adams
  • 53,009
  • 9
  • 91
  • 143
  • As I do not want any words to be "common" and excluded from search, I guess I have to use BOOLEAN MODE then. I tried actually adding "amet" to one row and it is being found - so I guess it is because of the 50% thing you are mentioning. But I just edited some dummy upload entries. I added "four" and "seven" into their descriptions. Both don't become results when searching for them. Tough "nice" being in a description and searched works. Are there something like keywords not being searched/excluded? –  Mar 17 '14 at 20:15
  • @handle, yes there is the list of [stop words](https://dev.mysql.com/doc/refman/5.1/en/fulltext-stopwords.html). Both "four" and "seven" is in there. If this is your server, you can [configure that list](http://stackoverflow.com/questions/796688/how-to-reset-stop-words-in-mysql). If it's a shared server (shared hosting), you probably won't be able to. – Marcus Adams Mar 17 '14 at 20:23
  • I cannot change them - it is shared hosting - I guess I can live without those words. But what I just read about the boolean mode search: They do not automatically sort rows in order of decreasing relevance. The row with the highest relevance is the one that contains “MySQL” twice, but it is listed last, not first. I would prefer the other way around. Can i use the not boolean search and remove that 50% common words exclusion? Or invert the boolean search result? –  Mar 17 '14 at 20:32
  • @handle, BOOLEAN fulltext searches don't automatically order the results by relevance. See my updated answer for the solution, though I'm not sure how accurate the relevance score is in BOOLEAN mode. – Marcus Adams Mar 17 '14 at 21:02
  • ok, this would be great if this works, but actually I have this big complexe query already :) I don't know how I get this in there. I can simply add the bottom stuff (WHERE MATCH (description,tags) AGAINST ("lorem") ORDER BY relevance DESC) in there. But I am not sure how I put in the SELECT part there you are suggesting. –  Mar 18 '14 at 09:58
  • I have updated the full query as you see on top. But I get the error "Unknown column 'relevance' in 'order clause'". –  Mar 18 '14 at 10:10
  • I have tried this query which delivers results without error `SELECT * , MATCH (description,tags) AGAINST ("moree" IN BOOLEAN MODE) AS relevance FROM uploads WHERE online_state = 1 AND MATCH (description,tags) AGAINST ("moree" IN BOOLEAN MODE) ORDER BY relevance DESC, last_update DESC, id DESC` but it does not order the results as expected. I have three database entries with the word "moree" in their descriptions, and the one which has it actaully three times in it is not at first place, but inbewteen. –  Mar 18 '14 at 10:47
  • I have tried this query `SELECT MATCH (description,tags) AGAINST ("moree" IN BOOLEAN MODE) AS relevance FROM uploads WHERE online_state = 1 AND MATCH (description,tags) AGAINST ("moree" IN BOOLEAN MODE) ORDER BY relevance DESC, last_update DESC, id DESC` and it gives me back a column with the relevances. But the relevance is "1" everywhere, but i have used the search word in a different amount in the table entries, so I guess this can't be ordered? –  Mar 18 '14 at 11:55
  • If I cannot order it by relevancy, because every result is "1", then I simply remove the relevancy ordering. I still have ordering by date and id applied. I guess exact word results are always 1 and only substrings are something like 0.65? –  Mar 18 '14 at 12:27