SQLite: Optimize table scans

Question

In my table I have some columns that have no index on them, searching for a value in those columns can take very long because SQLite does a full table scan.

In my specific case the row I'm looking for (the values are unique) is almost always among the most recently inserted.

I suspect that SQLite starts from the oldest (first) row when doing the scan, is there any way to instruct SQLite to do the table-scan in reverse order?

UPDATE: I found this in the changelog:

The optimizer will now scan tables in the reverse if doing so will satisfy an ORDER BY ... DESC clause.

So maybe this means I can just do add an ORDER BY clause, to speed it up.

Order by DESC is not going to cause SQLite to stop scanning rows when it finds a match. If your values are unique in that column, it will continue to scan, finding no additional matches after it has found the first match. See my LIMIT suggestion below. — Tim, Sep 25 '11 at 19:35

score 2 · Accepted Answer · answered Sep 25 '11 at 20:46

2

The solution was:

ORDER BY rowid DESC LIMIT 1

It made the lookups lightning fast!

answered Sep 25 '11 at 20:46

Maestro

9,046
15
83
116

Tim · Answer 2 · 2011-09-25T17:36:57.590

The order of the scan (oldest-to-youngest or youngest-to-oldest) is irrelevant because a FULL table scan is required. Every row must be visited. Although you mention 'the row I'm looking for' the condition

                    .... where col = 'a'

might return one row, or it might return 10 rows, or 500 rows. SQLite cannot simply stop and call it a good day's work when the first matching row is encountered unless you use the LIMIT directive.

EDIT: What you could do, however, is use a timestamp column, index it, and then use an inline view to get the relatively recent rows:

              select * from
              (
               select * from T 
               where datecreated > {somerecentdate}
              ) as myView
              where myView.someColumn = 'a'

or simply

            select * from T where datecreated > {some date} and somecolumn = 'a'

That approach could be an iterative process -- if no rows are returned you may need to requery with a wider time-window. But if you're going to index datecreated you might as well index [someColumn].

It will always return one row, because the values in the column are unique. I do not use UNIQUE or INDEX because it slows down the inserts too much. My initial thought was to use the same iterative process as you describe above, and there isn't even a need for a extra 'datecreated' column, you can just do it on the row's id, since they always increment. But I hoped there was a build-in solution, without implementing a smart algoritm myself. — Maestro, Sep 25 '11 at 18:28
That it will always return one row is not a fact that SQLite can know, if the column is not indexed. A FULL table scan will always be necessary. If there's a LIMIT() optimization, you can use LIMIT 1. — Tim, Sep 25 '11 at 19:32

SQLite: Optimize table scans

2 Answers2