0

Background I am using SQLite to store around 10M entries, where the size of each entry is around 1Kb. I am reading this data back in chunks of around 100K entries at a time, using multiple parallel threads. Read and writes are not going in parallel and all the writes are done before starting the reads.

Problem I am experiencing too many disk reads. Each second around 3k reads are happening and I am reading only 30Kb data in those 3k reads (Hence around 100 bytes per disk read). As the result, I am seeing a really horrible performance (It is taking around 30 minutes to read the data)

Question

  1. Is there any SQlite settings/pragmas that I can use to avoid the small size disk reads?
  2. Are there any best practices for batch parallel reads in SQlite?
  3. Does SQlite read all the results of a query in one go? Or read the results in smaller chunks? If latter is the case, then where does it stone partial out of a query

Implementation Details My using SQlite with Java and my application is running on linux. JDBC library is https://github.com/xerial/sqlite-jdbc (Version 3.20.1).

P.S I am already built the necessary Indexes and verified that no table scans are going on (using Explain Query planner)

maneet
  • 295
  • 2
  • 10

1 Answers1

1

When you are searching for data with an index, the database first looks up the value in the index, and then goes to the corresponding table row to read all the other columns.

Unless the table rows happen to be stored in the same order as the values in the index, each such table read must go to a different page.

Indexes speed up searches only if the seach reduces the number of rows. If you're going to read all (or most of the) rows anyway, a table scan will be much faster.

Parallel reads will be more efficient only if the disk can actually handle the additional I/O. On rotating disks, the additional seeks will just make things worse.

(SQLite tries to avoid storing temporary results. Result rows are computed on the fly (as much as possible) while you're stepping through the cursor.)

CL.
  • 173,858
  • 17
  • 217
  • 259
  • Thanks, CL. for the reply. What you are saying makes the perfect sense, and I can do experiments with removing indexes. I want to read the results in sorted order though. I am reading results in paginated sort of way https://www.sqlite.org/cvstrac/wiki?p=ScrollingCursor. Is there a pattern that I can use to store related results on DB/OS page? – maneet Sep 09 '18 at 19:47
  • Then you'd have to insert the row in sorted order. – CL. Sep 09 '18 at 19:59