
I realize that diagnosing these issues can be complex and requires a lot of information, but I'm hoping there is a known issue or workaround for this particular case. I've narrowed down the change in the query that causes the sub-optimal query plan (this is running Postgres 9.4).

The following query runs in about 50ms. The tag_device table is a junction table with ~2 million entries, the devices table has about 1.5 million entries and the tags table has about 500,000 entries (Note: the actual IP values are just made up).

WITH inner_query AS (
  SELECT * FROM tag_device
  INNER JOIN tags ON tag_device.tag_id = tags.id
  INNER JOIN devices ON tag_device.device_id = devices.id
  WHERE devices.device_ip <<= ANY(ARRAY[
    '10.0.0.1', '10.0.0.2', '10.0.0.5', '11.1.1.1', '12.2.2.35', '13.0.0.1', '15.0.0.8', '1.160.0.1', '17.1.1.24', '18.2.2.1',
    '10.0.0.6', '10.0.0.21', '10.0.0.52', '11.1.1.2', '12.2.2.34', '13.0.0.2', '15.0.0.7', '1.160.0.2', '17.1.1.23', '18.2.2.2',
    '10.0.0.7', '10.0.0.22', '10.0.0.53', '11.1.1.3', '12.2.2.33', '13.0.0.3', '15.0.0.6', '1.160.0.3', '17.1.1.22', '18.2.2.3'
  ]::iprange[])
)
SELECT * FROM inner_query LIMIT 100 OFFSET 0;

A few things to note: device_ip uses the ip4r module (https://github.com/RhodiumToad/ip4r) to provide IP range lookups, and the column has a GiST index on it. The above query runs in about 50ms with the following query plan:
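For context, the relevant part of the schema looks roughly like this (a simplified sketch; the column type and the exact index definition are assumptions on my part, but it is a plain GiST index on device_ip):

-- Simplified sketch; the real tables have more columns, and device_ip
-- could equally be declared as ip4 or iprange depending on requirements.
CREATE EXTENSION IF NOT EXISTS ip4r;

CREATE TABLE devices (
    id         bigserial PRIMARY KEY,
    device_ip  ipaddress NOT NULL
    -- ... other columns ...
);

-- GiST index that supports the <<= containment operator used in the WHERE clause.
CREATE INDEX devices_iprange_idx ON devices USING gist (device_ip);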

Limit  (cost=140367.19..140369.19 rows=100 width=239)
  CTE inner_query
    ->  Nested Loop  (cost=40147.63..140367.19 rows=56193 width=431)
          ->  Merge Join  (cost=40147.20..113345.15 rows=56193 width=261)
                Merge Cond: (tag_device.device_id = devices.id)
                ->  Index Scan using tag_device_device_id_idx on tag_device  (cost=0.43..67481.36 rows=1900408 width=51)
                ->  Materialize  (cost=40136.82..40402.96 rows=53228 width=210)
                      ->  Sort  (cost=40136.82..40269.89 rows=53228 width=210)
                            Sort Key: devices.id
                            ->  Bitmap Heap Scan on devices  (cost=1489.12..30498.45 rows=53228 width=210)
                                  Recheck Cond: (device_ip <<= ANY ('{10.0.0.1,10.0.0.2,10.0.0.5,11.1.1.1,12.2.2.2,13.0.0.1,15.0.0.2,1.160.0.5,17.1.1.1,18.2.2.2,10.0.0.1,10.0.0.2,10.0.0.5,11.1.1.1,12.2.2.2,13.0.0.1,15.0.0.2,1.160.0.5,17.1.1.1,18.2.2.2 (...)
                                  ->  Bitmap Index Scan on devices_iprange_idx  (cost=0.00..1475.81 rows=53228 width=0)
                                        Index Cond: (device_ip <<= ANY ('{10.0.0.1,10.0.0.2,10.0.0.5,11.1.1.1,12.2.2.2,13.0.0.1,15.0.0.2,1.160.0.5,17.1.1.1,18.2.2.2,10.0.0.1,10.0.0.2,10.0.0.5,11.1.1.1,12.2.2.2,13.0.0.1,15.0.0.2,1.160.0.5,17.1.1.1,18.2 (...)
          ->  Index Scan using tags_id_pkey on tags  (cost=0.42..0.47 rows=1 width=170)
                Index Cond: (id = tag_device.tag_id)
  ->  CTE Scan on inner_query  (cost=0.00..1123.86 rows=56193 width=239)

If I increase the number of IP addresses in the ARRAY being looked up, the query plan changes and becomes drastically slower. In the fast version of the query there are 30 items in the array; if I increase this to 80 items, the plan changes and the query becomes significantly slower (over 10x). The query remains the same in all other ways. The new plan uses hash joins instead of merge joins and nested loops. Here is the new (much slower) query plan when the array has 80 items in it instead of 30.

Limit  (cost=204482.39..204484.39 rows=100 width=239)
  CTE inner_query
    ->  Hash Join  (cost=85839.13..204482.39 rows=146180 width=431)
          Hash Cond: (tag_device.tag_id = tags.id)
          ->  Hash Join  (cost=51368.40..145023.34 rows=146180 width=261)
                Hash Cond: (tag_device.device_id = devices.id)
                ->  Seq Scan on tag_device  (cost=0.00..36765.08 rows=1900408 width=51)
                ->  Hash  (cost=45580.57..45580.57 rows=138466 width=210)
                      ->  Bitmap Heap Scan on devices  (cost=3868.31..45580.57 rows=138466 width=210)
                            Recheck Cond: (device_ip <<= ANY ('{10.0.0.1,10.0.0.2,10.0.0.5,11.1.1.1,12.2.2.35,13.0.0.1,15.0.0.8,1.160.0.1,17.1.1.24,18.2.2.1,10.0.0.6,10.0.0.21,10.0.0.52,11.1.1.2,12.2.2.34,13.0.0.2,15.0.0.7,1.160.0.2,17.1.1.23,18.2.2.2 (...)
                            ->  Bitmap Index Scan on devices_iprange_idx  (cost=0.00..3833.70 rows=138466 width=0)
                                  Index Cond: (device_ip <<= ANY ('{10.0.0.1,10.0.0.2,10.0.0.5,11.1.1.1,12.2.2.35,13.0.0.1,15.0.0.8,1.160.0.1,17.1.1.24,18.2.2.1,10.0.0.6,10.0.0.21,10.0.0.52,11.1.1.2,12.2.2.34,13.0.0.2,15.0.0.7,1.160.0.2,17.1.1.23,18.2 (...)
          ->  Hash  (cost=16928.88..16928.88 rows=475188 width=170)
                ->  Seq Scan on tags  (cost=0.00..16928.88 rows=475188 width=170)
  ->  CTE Scan on inner_query  (cost=0.00..2923.60 rows=146180 width=239)

The above query with its default query plan runs in about 500ms (over 10 times slower). If I turn off hash joins using SET enable_hashjoin = off; then the plan goes back to using merge joins and runs in ~50ms again, even with 80 items in the array.
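For what it's worth, the override doesn't have to be session-wide; it can be limited to the one statement by wrapping it in a transaction and using SET LOCAL. A sketch (IP list abbreviated):

BEGIN;
-- SET LOCAL only lasts until COMMIT/ROLLBACK, so the rest of the
-- session keeps the default planner settings.
SET LOCAL enable_hashjoin = off;

WITH inner_query AS (
  SELECT * FROM tag_device
  INNER JOIN tags ON tag_device.tag_id = tags.id
  INNER JOIN devices ON tag_device.device_id = devices.id
  WHERE devices.device_ip <<= ANY(ARRAY[
    '10.0.0.1', '10.0.0.2'  -- ... the rest of the 80 addresses ...
  ]::iprange[])
)
SELECT * FROM inner_query LIMIT 100 OFFSET 0;

COMMIT;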

Again, the only change here is the number of items in the ARRAY being looked up.

Does anyone have any thoughts on why the planner is making the poor choice that results in the massive slow down?

The database fits into memory completely and is on SSDs.

I also want to point out that I'm using a CTE because I ran into an issue where the planner would not use the index on the tag_device table once I added the LIMIT to the query; basically the issue described here: http://thebuild.com/blog/2014/11/18/when-limit-attacks/.
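For clarity, this is the shape of the non-CTE version that hit that problem; on 9.4 a CTE acts as an optimization fence, which is what keeps the planner from making a bad choice once the LIMIT is attached directly to the join (sketch, IP list abbreviated):

-- Non-CTE form: with LIMIT applied directly, the planner stopped using
-- the index on tag_device (the "when LIMIT attacks" problem).
SELECT * FROM tag_device
INNER JOIN tags ON tag_device.tag_id = tags.id
INNER JOIN devices ON tag_device.device_id = devices.id
WHERE devices.device_ip <<= ANY(ARRAY[
  '10.0.0.1', '10.0.0.2'  -- ... the rest of the addresses ...
]::iprange[])
LIMIT 100 OFFSET 0;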

Thanks!

Sarus
  • It hurts that it's Postgresql and you cannot use hints... Did you try to play with default_statistics_target and ANALYZE. – dcieslak Oct 08 '15 at 14:12
  • @dcieslak Thank you for the comment. I did up the default_statistics_target to 10000 for all the pertinent columns and reanalyzed. It still got it wrong. For now I'm just dealing by having to set enable_hashjoin=OFF; – Sarus Oct 16 '15 at 16:38
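(For reference, the per-column version of that change looks roughly like this; the columns shown are the ones involved in the join and filter above:)

-- Raise the sample size the planner uses for these columns, then re-gather stats.
ALTER TABLE devices    ALTER COLUMN device_ip SET STATISTICS 10000;
ALTER TABLE tag_device ALTER COLUMN device_id SET STATISTICS 10000;
ALTER TABLE tag_device ALTER COLUMN tag_id    SET STATISTICS 10000;

ANALYZE devices;
ANALYZE tag_device;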

1 Answer


I see that there is a sort as part of the merge join. Once you get past a certain threshold, the sort needed for the merge join is deemed too expensive and a hash join is estimated to be cheaper. It may be more expensive time-wise but cheaper in terms of CPU consumption to run the query this way.
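That estimate also depends on work_mem: a sort (or hash table) that is expected to spill to disk is costed much higher than one that fits in memory, so raising it may shift the planner back toward the merge join without disabling hash joins outright. A sketch, assuming the setting is still near the default:

-- Check what the session is currently allowed to use per sort/hash operation.
SHOW work_mem;

-- Allow larger in-memory sorts and hash tables for this session only,
-- then re-run EXPLAIN (ANALYZE, BUFFERS) on the 80-item query and compare plans.
SET work_mem = '128MB';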

Hunter Nelson