4

I'm facing an unexpected behavior when using Exposed with a DataSource (I tried apache DBCP and HikariCP).

Setup: single table (test) with id and flag fields with an index on flag.

Query: SELECT * from test where flag=1 limit 1;

When run manually, the index is used and the query is fast. When run repeatedly via exposed, after 9 invocations, the performance degrades. The index is not used anymore - see query plans below.

Here is example code:


object TestTable : IntIdTable() {
    val flag = integer("flag").index()
}

fun createNRows(n: Int) = repeat(n) {
    TestTable.insert { it[flag] = 0 }
}

fun main(args: Array<String>) {
    val ds = HikariDataSource(HikariConfig().apply {
        jdbcUrl = "jdbc:postgresql://localhost:5432/testdb"
        username = ...
        password = ...
        setDriverClassName("org.postgresql.Driver")
    })

    Database.connect(ds)


    transaction {
        // only run the first time:
        // SchemaUtils.create(TestTable)
        // createNRows(1000000) 
        println("total ${TestTable.selectAll().count()} elements")
    }

    repeat(10000) {
        transaction {
            val startedAt = System.currentTimeMillis()
            TestTable.select { TestTable.flag.eq(1) }.limit(1).toList()
            println("Query took ${System.currentTimeMillis() - startedAt}")
        }
    }
}

Output:

total 1000000 elements
Query took 6
Query took 1
Query took 1
Query took 1
Query took 1
Query took 1
Query took 1
Query took 1
Query took 0
Query took 79
Query took 64
Query took 63
Query took 62
Query took 63
....

Here are the postgres logs with EXPLAIN (ANALYZE, BUFFERS) enabled:

This is the last fast query:

2020-03-10 23:03:00.596 CET [71012] LOG:  duration: 0.021 ms  bind S_2: 
2020-03-10 23:03:00.597 CET [71012] LOG:  duration: 0.083 ms  parse <unnamed>: SET SESSION CHARACTERISTICS AS TRANSACTION ISOLATION LEVEL REPEATABLE READ
2020-03-10 23:03:00.597 CET [71012] LOG:  duration: 0.013 ms  bind <unnamed>: SET SESSION CHARACTERISTICS AS TRANSACTION ISOLATION LEVEL REPEATABLE READ
2020-03-10 23:03:00.597 CET [71012] LOG:  execute <unnamed>: SET SESSION CHARACTERISTICS AS TRANSACTION ISOLATION LEVEL REPEATABLE READ
2020-03-10 23:03:00.597 CET [71012] LOG:  duration: 0.025 ms
2020-03-10 23:03:00.597 CET [71012] LOG:  duration: 0.011 ms  bind S_3: BEGIN
2020-03-10 23:03:00.597 CET [71012] LOG:  execute S_3: BEGIN
2020-03-10 23:03:00.597 CET [71012] LOG:  duration: 0.015 ms
2020-03-10 23:03:00.598 CET [71012] LOG:  duration: 0.159 ms  bind S_4: SELECT test.id, test.flag FROM test WHERE test.flag = $1 LIMIT 1
2020-03-10 23:03:00.598 CET [71012] DETAIL:  parameters: $1 = '1'
2020-03-10 23:03:00.598 CET [71012] LOG:  execute S_4: SELECT test.id, test.flag FROM test WHERE test.flag = $1 LIMIT 1
2020-03-10 23:03:00.598 CET [71012] DETAIL:  parameters: $1 = '1'
2020-03-10 23:03:00.598 CET [71012] LOG:  duration: 0.028 ms
2020-03-10 23:03:00.598 CET [71012] LOG:  duration: 0.015 ms  plan:
    Query Text: SELECT test.id, test.flag FROM test WHERE test.flag = $1 LIMIT 1
    Limit  (cost=0.42..4.44 rows=1 width=8) (actual time=0.013..0.013 rows=0 loops=1)
      Buffers: shared hit=3
      ->  Index Scan using test_flag on test  (cost=0.42..4.44 rows=1 width=8) (actual time=0.012..0.012 rows=0 loops=1)
            Index Cond: (flag = 1)
            Buffers: shared hit=3
2020-03-10 23:03:00.598 CET [71012] LOG:  duration: 0.072 ms  bind S_1: COMMIT
2020-03-10 23:03:00.598 CET [71012] LOG:  execute S_1: COMMIT
2020-03-10 23:03:00.598 CET [71012] LOG:  duration: 0.017 ms
2020-03-10 23:03:00.599 CET [71012] LOG:  duration: 0.022 ms  parse <unnamed>: SET SESSION CHARACTERISTICS AS TRANSACTION ISOLATION LEVEL READ COMMITTED
2020-03-10 23:03:00.599 CET [71012] LOG:  duration: 0.007 ms  bind <unnamed>: SET SESSION CHARACTERISTICS AS TRANSACTION ISOLATION LEVEL READ COMMITTED
2020-03-10 23:03:00.599 CET [71012] LOG:  execute <unnamed>: SET SESSION CHARACTERISTICS AS TRANSACTION ISOLATION LEVEL READ COMMITTED
2020-03-10 23:03:00.599 CET [71012] LOG:  duration: 0.013 ms

And here is the first "slow" one:

2020-03-10 23:03:01.601 CET [71012] LOG:  duration: 0.022 ms  bind S_2: 
2020-03-10 23:03:01.602 CET [71012] LOG:  duration: 0.052 ms  parse <unnamed>: SET SESSION CHARACTERISTICS AS TRANSACTION ISOLATION LEVEL REPEATABLE READ
2020-03-10 23:03:01.602 CET [71012] LOG:  duration: 0.011 ms  bind <unnamed>: SET SESSION CHARACTERISTICS AS TRANSACTION ISOLATION LEVEL REPEATABLE READ
2020-03-10 23:03:01.602 CET [71012] LOG:  execute <unnamed>: SET SESSION CHARACTERISTICS AS TRANSACTION ISOLATION LEVEL REPEATABLE READ
2020-03-10 23:03:01.602 CET [71012] LOG:  duration: 0.023 ms
2020-03-10 23:03:01.602 CET [71012] LOG:  duration: 0.012 ms  bind S_3: BEGIN
2020-03-10 23:03:01.602 CET [71012] LOG:  execute S_3: BEGIN
2020-03-10 23:03:01.602 CET [71012] LOG:  duration: 0.015 ms
2020-03-10 23:03:01.602 CET [71012] LOG:  duration: 0.192 ms  bind S_4: SELECT test.id, test.flag FROM test WHERE test.flag = $1 LIMIT 1
2020-03-10 23:03:01.602 CET [71012] DETAIL:  parameters: $1 = '1'
2020-03-10 23:03:01.602 CET [71012] LOG:  execute S_4: SELECT test.id, test.flag FROM test WHERE test.flag = $1 LIMIT 1
2020-03-10 23:03:01.602 CET [71012] DETAIL:  parameters: $1 = '1'
2020-03-10 23:03:01.678 CET [71012] LOG:  duration: 75.889 ms
2020-03-10 23:03:01.679 CET [71012] LOG:  duration: 75.868 ms  plan:
    Query Text: SELECT test.id, test.flag FROM test WHERE test.flag = $1 LIMIT 1
    Limit  (cost=0.00..0.02 rows=1 width=8) (actual time=75.864..75.864 rows=0 loops=1)
      Buffers: shared hit=96 read=4329
      ->  Seq Scan on test  (cost=0.00..16925.00 rows=1000000 width=8) (actual time=75.862..75.862 rows=0 loops=1)
            Filter: (flag = $1)
            Rows Removed by Filter: 1000000
            Buffers: shared hit=96 read=4329
2020-03-10 23:03:01.679 CET [71012] LOG:  duration: 0.054 ms  bind S_1: COMMIT
2020-03-10 23:03:01.679 CET [71012] LOG:  execute S_1: COMMIT
2020-03-10 23:03:01.679 CET [71012] LOG:  duration: 0.014 ms
2020-03-10 23:03:01.679 CET [71012] LOG:  duration: 0.025 ms  parse <unnamed>: SET SESSION CHARACTERISTICS AS TRANSACTION ISOLATION LEVEL READ COMMITTED
2020-03-10 23:03:01.679 CET [71012] LOG:  duration: 0.004 ms  bind <unnamed>: SET SESSION CHARACTERISTICS AS TRANSACTION ISOLATION LEVEL READ COMMITTED
2020-03-10 23:03:01.679 CET [71012] LOG:  execute <unnamed>: SET SESSION CHARACTERISTICS AS TRANSACTION ISOLATION LEVEL READ COMMITTED
2020-03-10 23:03:01.679 CET [71012] LOG:  duration: 0.009 ms

Postgres version (Homebrew):

postgres (PostgreSQL) 11.5

Client versions:


dependencies {
    implementation 'org.jetbrains.exposed:exposed:0.17.7'
    implementation "org.postgresql:postgresql:42.2.8"
    implementation 'org.jetbrains.kotlin:kotlin-stdlib'
    implementation 'com.zaxxer:HikariCP:2.3.2'
}

The postgres config is default (the logs are generated with auto-explain, but the issue reproduces without it)

Here is the source of the example: https://github.com/RomanBrodetski/kotlin-exposed-issue

Observations:

  • if .limit(1) is removed, the issue is not reproduced
  • if a Datasource is not used (Database.connect("jdbc:postgresql://localhost:5432/testdb", driver = "org.postgresql.Driver") instead of Database.connect(ds)), the issue is not reproduced
  • if there are additional statements in transaction, the issue is not reproduced.
roman-roman
  • 2,746
  • 19
  • 27
  • 1
    Can you show `EXPLAIN (ANALYZE, BUFFERS)` output for one of the slower runs? You can use auto_explain for that. – Laurenz Albe Mar 10 '20 at 19:31
  • @LaurenzAlbe Thanks for the suggestion, please see the updated post – roman-roman Mar 10 '20 at 22:10
  • The difference is that the slow query is using a server side prepared statement with a generic plan. How are the values in the column distributed? Has `ANALYZE` been run on the table? – Laurenz Albe Mar 11 '20 at 04:04
  • @LaurenzAlbe there are no or very few elements with `flag == 1`. I tried to run `ANALYZE`, but nothing have changed. – roman-roman Mar 11 '20 at 07:43
  • How are the values other than 1 distributed in the table column? – Laurenz Albe Mar 11 '20 at 07:53
  • @LaurenzAlbe, all others are zeroes. – roman-roman Mar 11 '20 at 07:55
  • I cannot reproduce that. Can you tag your PostgreSQL version? Also, could you add all non-default parameters you are using to the question? – Laurenz Albe Mar 11 '20 at 08:04
  • @LaurenzAlbe I added the versions and pushed the example to github. Thanks! – roman-roman Mar 11 '20 at 08:39
  • 1
    Sounds like parameter sniffing/bind variable issue. `SELECT * from test where flag=1 limit 1;` - flag is hardcoded, actual code `SELECT test.id, test.flag FROM test WHERE test.flag = $1 LIMIT 1` - flag is a parameter(the plan has to be more generic to satisfy all cases) https://use-the-index-luke.com/blog/2011-07-16/planning-for-reuse – Lukasz Szozda Mar 21 '20 at 08:12
  • Second thing: LIMIT without ORDER BY in most of the cases does not make sense(unless you do not care which value is taken) – Lukasz Szozda Mar 21 '20 at 08:19
  • JDBC driver: https://jdbc.postgresql.org/documentation/head/server-prepare.html - caching – Lukasz Szozda Mar 21 '20 at 08:39
  • If you have no or very few flag=1 and you know in advance finding flag=1 is important, then using partial index might help. ```CREATE INDEX test_flag_true_idx ON test (flag) WHERE flag = 1;``` – Jindra Mar 24 '20 at 20:48
  • @Jindra, I tried that - same result.. – roman-roman Mar 24 '20 at 21:52

2 Answers2

3

Completely removing .limit(1) makes it use index all the time. The problem is the general plan that is created for prepared statement after few(5) executions is wrong. The limit 1 is what makes it so. Making the 1 a bind variable solves the problem. Unfortunately I didn't find a way to do that in the Exposed library - it inlines the number into the prepared statement.

For some reason it thinks it can find a matching row immediately during sequential scan and no matter what vacuum/analyze/create statistics I do I cannot change it's mind. (I tried changing distribution of flag values, didn't help)

Reproducing the issue from SQL:

create index test_flag_partial_idx on test (flag) include (id) where flag is not null and flag = 1;

vacuum  full  analyse  test;

PREPARE select_with_limit_as_value AS SELECT test.id, test.flag FROM test WHERE test.flag IS NOT NULL AND test.flag = $1 LIMIT 1;
EXECUTE select_with_limit_as_value(1);
EXECUTE select_with_limit_as_value(1);
EXECUTE select_with_limit_as_value(1);
EXECUTE select_with_limit_as_value(1);
EXECUTE select_with_limit_as_value(1);
EXECUTE select_with_limit_as_value(1);


PREPARE select_with_limit_as_bind AS SELECT test.id, test.flag FROM test WHERE test.flag IS NOT NULL AND test.flag = $1 LIMIT $2;
EXECUTE select_with_limit_as_bind(1, 1);
EXECUTE select_with_limit_as_bind(1, 1);
EXECUTE select_with_limit_as_bind(1, 1);
EXECUTE select_with_limit_as_bind(1, 1);
EXECUTE select_with_limit_as_bind(1, 1);
EXECUTE select_with_limit_as_bind(1, 1);

The first prepared statement uses limit as hard-coded value and after few executions switches to general plan that uses sequential scan. The second prepared statement uses limit as bind variable and the general plan uses index.

You need to either hard-code the flag param into the query or make the limit a bind variable.

In PostgreSQL 12 you can disable generic plans, you could change it before and after your query:

set plan_cache_mode = force_custom_plan;

All tried on PostgreSQL 12.2.

Jindra
  • 780
  • 13
  • 39
0

Try collecting table stats after you insert data. Looks like CBO has less statistics to understand the table structure. Actually, it is not a bad idea for postgres to not use the index you created because all the values of the index is the same. Therefore, next thing to try can be removing the index from your code, or creating a better index.

In the end it doesn't seem to be related to Exposed but to Postgresql itself.

(I wanted to comment, but it is not possible because of my SO reputation, therefore wrote an answer)

Kemal Cengiz
  • 133
  • 1
  • 8