8

So, I need to find out how to do a fulltext boolean search on a MySQL database to return a record containg the term "C++".

I have my SQL search string as:

SELECT * 
FROM mytable 
WHERE MATCH (field1, field2, field3) 
AGAINST ("C++" IN BOOLEAN MODE) 

Although all of my fields contain the string C++, it is never returned in the search results.

How can I modify MySQL to accommodate this? Is it possible?

The only solution I have found would be to escape the + character during the process of entering my data as something like "__plus" and then modifying my search to accomodate, but this seems cumbersome and there has to be a better way.

zessx
  • 68,042
  • 28
  • 135
  • 158
Bamerza
  • 1,335
  • 1
  • 18
  • 34
  • What the heck is a full-text Boolean search? It sounds awfully a lot like you made up that word. Also, your question is very vague. You need to provide a better description of your problem, what you're trying to do. Otherwise we can't help you. – John Leidegren Feb 25 '09 at 06:52
  • @John: You're being very harsh. I had to refrain myself from linking to lmgtfy, so here's an explanation from MySQL instead: http://dev.mysql.com/doc/refman/5.1/en/fulltext-boolean.html – A. Rex Feb 25 '09 at 07:03
  • @A. Rex - I'm evil (in a sense) my point my come across as harsh but the question is lacking in regard to how you would phrase a good question. As such we (stack overflow people) are unable to provide a good answer based on insufficient information. – John Leidegren Feb 25 '09 at 07:20
  • @A. Rex - Further to exemplify my point, in many cases, the question is wrong. But if you phrase your question well, people with enough experience will be able to tell that your working this the wrong angle. And that's gonna provide you with better insight in the future. – John Leidegren Feb 25 '09 at 07:21
  • 6
    The term would be well known to MySQL users who have enough knowledge to answer the question. – bobince Feb 25 '09 at 07:32
  • 1
    @John Fulltext boolean is a good description of a well known MySQl search type. – Mazatec Feb 07 '11 at 10:12
  • @Myself - Well, this truly makes me feel like an idiot and rightfully so... – John Leidegren Feb 14 '11 at 07:26

4 Answers4

8

How can I modify MySQL to accommodate this?

You'll have to change MySQL's idea of what a word is.

Firstly, the default minimum word length is 4. This means that no search term containing only words of <4 letters will ever match, whether that's ‘C++’ or ‘cpp’. You can configure this using the ft_min_word_len config option, eg. in your my.cfg:

[mysqld]
ft_min_word_len=3

(Then stop/start MySQLd and rebuild fulltext indices.)

Secondly, ‘+’ is not considered a letter by MySQL. You can make it a letter, but then that means you won't be able to search for the word ‘fish’ in the string ‘fish+chips’, so some care is required. And it's not trivial: it requires recompiling MySQL or hacking an existing character set. See the section beginning “If you want to change the set of characters that are considered word characters...” in section 11.8.6 of the doc.

escape the + character during the process of entering my data as something like "__plus" and then modifying my search to accomodate

Yes, something like that is a common solution: you can keep your ‘real’ data (without the escaping) in a primary, definitive table — usually using InnoDB for ACID compliance. Then an auxiliary MyISAM table can be added, containing only the mangled words for fulltext search bait. You can also do a limited form of stemming using this approach.

Another possibility is to detect searches that MySQL can't do, such as those with only short words, or unusual characters, and fall back to a simple-but-slow LIKE or REGEXP search for those searches only. In this case you will probably also want to remove the stoplist by setting ft_stopword_file to an empty string, since it's not practical to pick up everything in that as special too.

bobince
  • 528,062
  • 107
  • 651
  • 834
  • @bobince: (Re the discussion in the comments above, thank you for answering this easily-understandable question in a clear manner.) Out of curiosity, how would you do stemming with this approach? Replace every word in the auxiliary table with its stem? – A. Rex Feb 25 '09 at 09:05
  • Essentially yes (processing the words in search queries in the same way of course). Typically you'd use an existing suffix-stripping stemmer library for your preferred languages. (For both values of ‘language’; see eg. Porter's algorithm for English in many programming languages.) – bobince Feb 25 '09 at 09:34
1

From http://dev.mysql.com/doc/refman/5.0/en/fulltext-boolean.html:

A phrase that is enclosed within double quote (“"”) characters matches only rows that contain the phrase literally, as it was typed.

This means you can search for 'C++' using this query:

SELECT * 
FROM mytable 
WHERE MATCH (field1, field2, field3) 
AGAINST ('"C++"' IN BOOLEAN MODE)
zessx
  • 68,042
  • 28
  • 135
  • 158
0

solution::

change my.ini file

put these two lines

ft_min_word_len = "1"
ft_stopword_file =""

below

[mysqld]

than savve file and restart mysql server.

my.ini file will sharewd by all. so can we do changes in my.ini file for some session only.?

zessx
  • 68,042
  • 28
  • 135
  • 158
0

Usually escaped characters are used in the query not in the database data. Try escaping each "+" in your query.

Paxic
  • 1,760
  • 16
  • 28