2

image you have application like this : 1 DB table, few int fields, few small varchar fields, and about 10 TEXT fields (contents variable - some data about 50 chars long, most about 100-200, some about 1000, very few more than 1000). Row count is in x0 000 - x00 000. Now, i need effective way to query like this (meta-language):

SELECT (1 if textfield1 LIKE %param1% ELSE 0) as r1,(1 if textfield2 LIKE %param2% ELSE 0) as r2, ... etc, for most of the text fields in 1 query typically (it is dynamic - may be 2 of them included, may be all of them).

Now the question - what is better for me, MySQL or MSSQL (probably express while possible,upgrade to full if really needed) ?

I know that MySQL have nice text indexes, which you have set on custom number of first characters, so i can balance it for the typical scenario (like this : http://fernandoipar.com/2009/08/12/indexing-text-columns-in-mysql/)

MSSQL has only full text indexing, which i have no experience with. Note that i do NOT need features like words proximity or similar words (run = ran; some stemming would be nice, but because data are multilingual it is impossible anyway). I need just common LIKE %word% system, thats all. And i also have to be able to find short substrings (2 chars).

Virtually the goal is to run as many as possible of these queries per hour/day (there wont be enough results, never ever, because they should be refreshed as often as possible), so think of this kind of efficiency as requirement :)

Thanx!

UPDATE: well aparently there is no way to use index for optimizing LIKE %foo% queries. So the new question is : is there any other way to speed up this type of queries ? (please omit things like "buy more ram or SSD" :)

rouen
  • 5,003
  • 2
  • 25
  • 48

2 Answers2

3

LIKE '%foo%' expression cannot be optimized in any RDBMS.

You need fulltext indexes in mysql or in sql server

I need just common LIKE %word% system

Then choose any DBMS you want, because all they will suck on such clause ;-)

zerkms
  • 249,484
  • 69
  • 436
  • 539
  • are you sure about that ? have you looked at the link i included about mysql ? seems to me that the guy pretty much optimized a query just like that with indexes on text fields – rouen Feb 08 '11 at 11:22
  • 1
    @rouen: there is **significant** difference between `LIKE 'foo%'` and `LIKE '%foo%'`. The latter **cannot** be optimized with indexes. By definition. Read about B-Trees. – zerkms Feb 08 '11 at 11:24
2

Today many applications use an external index and search engine.

Have a look at http://lucene.apache.org/

cherouvim
  • 31,725
  • 15
  • 104
  • 153
  • many thanks for this hint, i took a look and very quickly prototyped something and Lucene.NET definetly seems like the way to go.. freakin fast and flexible, i am pretty amazed actually :) – rouen Feb 09 '11 at 14:46