0

I have an SQLite database containing an R*tree virtual table. This table is behaving rather oddly and I'm at a loss as to what is wrong. I would appreciate any pointers to aspects I could investigate!

> dbGetQuery(con, 'PRAGMA integrity_check')
  integrity_check
1              ok

It seems fine...

> dbGetQuery(con, 'SELECT * FROM peakLoc LIMIT 5')
  peakID scanStart scanEnd    mzMin    mzMax
1  18481      5540    5904 435.1880 435.2095
2  18429      5555    5644 408.7411 408.7459
3  18251      5621    5710 432.7190 432.7285
4  16415      6081    6173 432.2292 432.2470
5  16391      6089    6351 454.1823 454.1960

The general look of the R*tree table

> dbGetQuery(con, 'SELECT MIN(scanEnd), MAX(scanEnd) FROM peakLoc')
  MIN(scanEnd) MAX(scanEnd)
1           51        19369

The bounds of scanEnd

> dbGetQuery(con, 'SELECT * FROM peakLoc WHERE scanEnd > 5000 LIMIT 5')
  peakID scanStart scanEnd    mzMin    mzMax
1  20987      4839    6284 410.1729 410.2035
2   6705      9827   10132 738.8564 738.8674
3  15190      6482    6756 615.3235 615.3395
4  15189      6482    6756 509.2193 509.2258
5  12001      7449    7710 855.4534 855.4631

So far so good...

> dbGetQuery(con, 'SELECT * FROM peakLoc WHERE scanEnd > 6000 LIMIT 5')
[1] peakID    scanStart scanEnd   mzMin     mzMax    
<0 rows> (or 0-length row.names)

Where are the records?

The same is happening for the other columns with bigger-than once the comparator gets to an arbitrary large number. This behaviour is only present in the R*tree table - the regular tables works fine...

Have I stumbled upon a constraint in the R*tree module that I do not know about? All records in the R*tree comes from one big insert and I have not touched the underlying tables that the R*tree relies on...

edit: On request from CL I've tried to create a reproducible example. At least on my system the following produces an R*tree with the same behaviour:

set.seed(1)
library(RSQLite)
con <- dbConnect(dbDriver('SQLite'), ':memory:')
dbGetQuery(con, 'CREATE VIRTUAL TABLE test USING rtree(id, xmin, xmax, ymin, ymax)')
x <- abs(rnorm(100))
y <- abs(rnorm(100))
data <- data.frame(id=1:100, xmin=x, xmax=x+2, ymin=y, ymax=y+3)
dbGetPreparedQuery(con, 'INSERT INTO test VALUES ($id, $xmin, $xmax, $ymin, $ymax)', bind.data=data)
dbGetQuery(con, 'SELECT max(xmax) FROM test')
dbGetQuery(con, 'SELECT * FROM test WHERE xmax > 4 LIMIT 5')
dbGetQuery(con, 'SELECT * FROM test WHERE +xmax > 4 LIMIT 5')

edit 2: A database created with the commands given in the first edit can be downloaded from this link: https://dl.dropboxusercontent.com/u/2323585/testdb.sqlite

ThomasP85
  • 1,624
  • 2
  • 15
  • 26
  • Any difference with `SELECT * FROM peakLoc WHERE +scanEnd > 6000 LIMIT 5`? – CL. Nov 08 '14 at 13:37
  • Yes - why should this make a difference? And why does it only kick in after a certain threshold? – ThomasP85 Nov 08 '14 at 17:49
  • Different results with and without `+` indicate a bug. Can you create a reproducible example (as small as possible)? – CL. Nov 08 '14 at 19:06
  • @CL. I just added a minimal example with random data. It seems worryingly easy to provoke this bug... – ThomasP85 Nov 11 '14 at 11:48
  • RSQLite appears to be broken in the current R version. I was not able to reproduce this with other random data in plain SQLite. Could you save the DB to a file? – CL. Nov 11 '14 at 13:34
  • Link to sqlite file added – ThomasP85 Nov 13 '14 at 14:32
  • I can reproduce this when querying the database file, but not when I insert the same data in SQLite 3.8.7.1. Which version is your R using? (`SELECT sqlite_version()`) – CL. Nov 13 '14 at 15:24
  • It's 3.7.17, the last 3.7.x before 3.8. I cannot find any mention of the bug in the release notes following 3.7.17, but it might have been silently fixed... – ThomasP85 Nov 13 '14 at 18:46
  • Just updated to 3.8.6 and gets the same bug. I've also tried to create the table line by line instead of using bind.data, but to no avail – ThomasP85 Nov 13 '14 at 19:02
  • The only *possible* change is the partial rewrite in version 3.8.5. Are you sure that you are using 3.8.6 *in R*? – CL. Nov 13 '14 at 19:07
  • The sqlite version that comes bundled with the latest version of RSQLite is 3.8.6 - this is also what I get when I SELECT sqlite_version() in the database I create using the example above – ThomasP85 Nov 13 '14 at 19:09
  • I also just tried adding the data to a another regular table and then inserting into the R*tree from there (to see if it was a problem with value passing in the RSQLite package). Still the same bug... – ThomasP85 Nov 13 '14 at 19:13
  • Just downloaded R 3.1.2 for Windows (with SQLite 3.8.6), and it works fine (both 32 and 64 bits). – CL. Nov 14 '14 at 08:19
  • Ok - so this is definitely a Mac problem. thanks for the help! I'll move this to an issue at RSQLite or sqlite – ThomasP85 Nov 14 '14 at 08:31

0 Answers0