31

I have a database with several fields :

word_id — INTEGER PRIMARY_KEY
word — TEXT
...

And ~150k rows. Since this is a dictionary, I'm searching for a word with mask 'search_string%' using LIKE. It used to work, taking 15ms to find matching rows. The table has an index for a field 'word'.

I modified the table (some fields which are out of the scope) and it's taking 400ms to execute a query, so I understand that as it fails to use index now. Straightforward query with = instead of LIKE shows 10ms result. What's happening here?

user4157124
  • 2,809
  • 13
  • 27
  • 42
nikans
  • 2,468
  • 1
  • 28
  • 35
  • i think so, you probably want to look at b-tree as b-tree are efficient in range query and like comparison. – Jasonw Dec 21 '11 at 02:49
  • Hmm, doesn't index suppose the creation of b-tree? You mean, I should create a b-tree by myself? – nikans Dec 21 '11 at 02:55
  • Is the % always only at the end of your search term, never the start ? – Michael Low Dec 21 '11 at 02:57
  • Yeah, always in the beginning. – nikans Dec 21 '11 at 02:58
  • 1
    Maybe try dropping and recreating the index? – P Varga Dec 21 '11 at 03:04
  • there are a few index type that can be specified during creating an index. example, in mysql it has, b-tree, hash and r-tree. but since you are using sqlite, am not familiar with it and so sqlite expert can comment further. – Jasonw Dec 21 '11 at 03:07
  • Thank you. I've already tried recreating an index, making several indexes for several fields and so on. I remember, that several months ago, when I was creating these indexes (that worked), I've been using an unusual query like `CREATE INDEX idxname ON table(words) __SOMETHING_HERE__` and I can't remember what was it on the end of the query. Maybe I had a dream about it, because it's highly unlikely )) – nikans Dec 21 '11 at 03:30
  • 3
    You should try to use `EXPLAIN QUERY PLAN` to see what's going on with your query. – Evgeny Shurakov Dec 21 '11 at 04:57
  • By the way, how would I do this on iPhone? I mean, just type `EXPLAIN QUERY PLAN SELECT ...` and watch what's up in console or what? – nikans Dec 21 '11 at 07:20
  • A good candidate for the clustered index would be `(word, word_id)` if that is possible in SQLite. – onedaywhen Dec 21 '11 at 14:30
  • nah, it appears to be 10 times slower with clustered index – nikans Dec 23 '11 at 00:24

2 Answers2

41

An index cannot safely be used in this case. A naive implementation would transform this:

... WHERE word LIKE 'search_string%'

into

... WHERE word >= 'search_string' AND word < 'search_strinh'

by incrementing the last character of the search string. The greater-than and less-than operators can use an index, where LIKE cannot.

Unfortunately, that won't work in the general case. The LIKE operator is case-insensitive, which means that 'a' LIKE 'A' is true. The above transformation would break any search string with capitalized letters.

In some cases, however, you know that case sensitivity is irrelevant for a particular column, and the above transformation is safe. In this case, you have two options.

  1. Use the NOCASE collating sequence on the index that covers this particular field.
  2. Change the behavior of the LIKE operator program-wide by running PRAGMA case_sensitive_like = ON;

Either of these behaviors will enable SQLite to transparently do the above transformation for you; you just keep using LIKE as always, and SQLite will rewrite the underlying query to use the index.

You can read more about "The LIKE Optimization" on the SQLite Query Optimizer Overview page.

Vladimir Panteleev
  • 24,651
  • 6
  • 70
  • 114
BJ Homer
  • 48,806
  • 11
  • 116
  • 129
  • 1
    The syntax of using `COLLATE NOCASE` with index is `CREATE INDEX index_name ON table_name (column_name COLLATE NOCASE);` if anyone interested. – AnT Oct 14 '21 at 11:33
2

GLOB prefix* workaround

This method does use the index, and so it is a good workaround for LIKE prefix% that doesn't require modifying the table collation:

SELECT * FROM mytable WHERE mycolumn GLOB 'myprefix*'

See also: Can an index on a text column speed up prefix based LIKE queries?

Tested on Sqlite 3.40.1, Ubuntu 23.04 on a TEXT column of a large-ish database I have lying around with 750 M rows. Query results are almost instantaneous.

Ciro Santilli OurBigBook.com
  • 347,512
  • 102
  • 1,199
  • 985