This is my first post.
I have a query that is taking longer than I would like (don't we all!). Depending on what I put in the WHERE clause, it may run faster. I am trying to understand why the query plans differ, and what I can do to speed the query up overall.
Here's Query #1:
SELECT date_observed, base_value
FROM device_read_data
WHERE fk_device_rw_id IN
(SELECT fk_device_rw_id FROM equipment_set_rw
WHERE fk_equipment_set_id = CAST('ed151028-1fc0-11e3-b79f-47c0fd87d2b4' AS uuid))
AND date_observed
BETWEEN '2013-12-01 07:45:00+00'::timestamptz
AND '2014-01-01 07:59:59+00'::timestamptz
AND base_value ~ '[0-9]+(\.[0-9]+)?'
;
"Hash Semi Join (cost=11.65..5640243.59 rows=92194 width=16) (actual time=34.947..132522.023 rows=43609 loops=1)"
" Hash Cond: (device_read_data.fk_device_rw_id = equipment_set_rw.fk_device_rw_id)"
" -> Seq Scan on device_read_data (cost=0.00..5449563.56 rows=72157042 width=32) (actual time=0.844..123760.331 rows=71764376 loops=1)"
" Filter: ((date_observed >= '2013-12-01 07:45:00+00'::timestamp with time zone) AND (date_observed <= '2014-01-01 07:59:59+00'::timestamp with time zone) AND ((base_value)::text ~ '[0-9]+(\.[0-9]+)?'::text))"
" Rows Removed by Filter: 82135660"
" -> Hash (cost=11.61..11.61 rows=3 width=16) (actual time=0.018..0.018 rows=1 loops=1)"
" Buckets: 1024 Batches: 1 Memory Usage: 1kB"
" -> Bitmap Heap Scan on equipment_set_rw (cost=4.27..11.61 rows=3 width=16) (actual time=0.016..0.016 rows=1 loops=1)"
" Recheck Cond: (fk_equipment_set_id = 'ed151028-1fc0-11e3-b79f-47c0fd87d2b4'::uuid)"
" -> Bitmap Index Scan on uc_fk_equipment_set_id_fk_device_rw_id (cost=0.00..4.27 rows=3 width=0) (actual time=0.011..0.011 rows=1 loops=1)"
" Index Cond: (fk_equipment_set_id = 'ed151028-1fc0-11e3-b79f-47c0fd87d2b4'::uuid)"
"Total runtime: 132530.290 ms"
Here's Query #2:
SELECT date_observed, base_value
FROM device_read_data
WHERE fk_device_rw_id IN
(SELECT fk_device_rw_id FROM equipment_set_rw
WHERE fk_equipment_set_id = CAST('ed151028-1fc0-11e3-b79f-47c0fd87d2b4' AS uuid))
AND date_observed
BETWEEN '2014-01-01 07:45:00+00'::timestamptz
AND '2014-02-01 07:59:59+00'::timestamptz
AND base_value ~ '[0-9]+(\.[0-9]+)?'
;
"Nested Loop (cost=4.27..1869543.46 rows=20391 width=16) (actual time=0.041..2053.656 rows=12997 loops=1)"
" -> Bitmap Heap Scan on equipment_set_rw (cost=4.27..9.73 rows=2 width=16) (actual time=0.015..0.017 rows=1 loops=1)"
" Recheck Cond: (fk_equipment_set_id = 'ed151028-1fc0-11e3-b79f-47c0fd87d2b4'::uuid)"
" -> Bitmap Index Scan on uc_fk_equipment_set_id_fk_device_rw_id (cost=0.00..4.27 rows=2 width=0) (actual time=0.010..0.010 rows=1 loops=1)"
" Index Cond: (fk_equipment_set_id = 'ed151028-1fc0-11e3-b79f-47c0fd87d2b4'::uuid)"
" -> Index Scan using idx_device_read_data_date_observed_fk_device_rw_id on device_read_data (cost=0.00..934664.91 rows=10195 width=32) (actual time=0.024..2050.656 rows=12997 loops=1)"
" Index Cond: ((date_observed >= '2014-01-01 07:45:00+00'::timestamp with time zone) AND (date_observed <= '2014-02-01 07:59:59+00'::timestamp with time zone) AND (fk_device_rw_id = equipment_set_rw.fk_device_rw_id))"
" Filter: ((base_value)::text ~ '[0-9]+(\.[0-9]+)?'::text)"
"Total runtime: 2055.068 ms"
I've only changed the date range in the WHERE clause. You can see that Query #1 does a Seq Scan over the entire table (~154M rows) versus an Index Scan in Query #2.
I'm trying to determine what causes the planner to switch, but I can't seem to find the answer.
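In case the planner's statistics are part of the answer, here's how I can inspect what it thinks about date_observed (a sketch using the standard pg_stats view):

SELECT n_distinct, correlation, histogram_bounds
FROM pg_stats
WHERE tablename = 'device_read_data'
  AND attname = 'date_observed';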
Additional Information
- There is a composite index on (date_observed, fk_device_rw_id) (definition sketched after this list)
- There are never any deletes on this table, so I assumed autovacuum wasn't needed
- I vacuumed the table anyway, but this had no effect
- I've rebuilt the index on this table
- I've run ANALYZE on this table
- This system is a copy of prod and is currently idle
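For reference, based on the index name in the plans, the composite index is essentially this (sketched; the real definition may differ in name or options):

CREATE INDEX idx_device_read_data_date_observed_fk_device_rw_id
    ON device_read_data (date_observed, fk_device_rw_id);

and the maintenance steps above were roughly:

VACUUM device_read_data;
REINDEX TABLE device_read_data;
ANALYZE device_read_data;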
System Information
- Running Postgres 9.2 on Linux
- 16GB system RAM
- shared_buffers set to 4GB
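In case other settings matter, this is how I can pull the planner-related ones (a sketch; I've guessed at which settings are relevant to the seq-scan-vs-index-scan decision):

SELECT name, setting, unit
FROM pg_settings
WHERE name IN ('shared_buffers', 'effective_cache_size',
               'work_mem', 'random_page_cost', 'seq_page_cost');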
What other information can I provide? I am sure there are things I have left out.
Thanks for your help.
Edit 1
I tried disabling sequential scans for the session:

SET enable_seqscan = false;
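(This is session-local; RESET enable_seqscan; afterwards restores the default.)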
Here are the EXPLAIN plan results:
"Hash Semi Join (cost=2566484.50..7008502.81 rows=92194 width=16) (actual time=18587.453..182228.966 rows=43609 loops=1)"
" Hash Cond: (device_read_data.fk_device_rw_id = equipment_set_rw.fk_device_rw_id)"
" -> Bitmap Heap Scan on device_read_data (cost=2566472.85..6817822.78 rows=72157042 width=32) (actual time=18562.247..172074.048 rows=71764376 loops=1)"
" Recheck Cond: ((date_observed >= '2013-12-01 07:45:00+00'::timestamp with time zone) AND (date_observed <= '2014-01-01 07:59:59+00'::timestamp with time zone))"
" Rows Removed by Index Recheck: 2102"
" Filter: ((base_value)::text ~ '[0-9]+(\.[0-9]+)?'::text)"
" Rows Removed by Filter: 12265137"
" -> Bitmap Index Scan on idx_device_read_data_date_observed_fk_device_rw_id (cost=0.00..2548433.59 rows=85430682 width=0) (actual time=18556.228..18556.228 rows=84029513 loops=1)"
" Index Cond: ((date_observed >= '2013-12-01 07:45:00+00'::timestamp with time zone) AND (date_observed <= '2014-01-01 07:59:59+00'::timestamp with time zone))"
" -> Hash (cost=11.61..11.61 rows=3 width=16) (actual time=16.134..16.134 rows=1 loops=1)"
" Buckets: 1024 Batches: 1 Memory Usage: 1kB"
" -> Bitmap Heap Scan on equipment_set_rw (cost=4.27..11.61 rows=3 width=16) (actual time=16.128..16.129 rows=1 loops=1)"
" Recheck Cond: (fk_equipment_set_id = 'ed151028-1fc0-11e3-b79f-47c0fd87d2b4'::uuid)"
" -> Bitmap Index Scan on uc_fk_equipment_set_id_fk_device_rw_id (cost=0.00..4.27 rows=3 width=0) (actual time=16.116..16.116 rows=1 loops=1)"
" Index Cond: (fk_equipment_set_id = 'ed151028-1fc0-11e3-b79f-47c0fd87d2b4'::uuid)"
"Total runtime: 182244.181 ms"
As predicted, the query took longer. Are there just too many records to make this faster?
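For scale: the bitmap index scan above matched ~84M rows for this one-month window, out of roughly 154M rows in the table, so the date range alone covers over half the table. A quick way to confirm the raw selectivity (just a sketch):

SELECT count(*)
FROM device_read_data
WHERE date_observed BETWEEN '2013-12-01 07:45:00+00'::timestamptz
                        AND '2014-01-01 07:59:59+00'::timestamptz;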
What are my choices?
Thanks.
Edit 2
I tried the rewrite approach (an explicit join in place of the IN subquery). I'm afraid the results were similar to the original.
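For reference, the rewritten query was essentially this (reconstructed from the aliases in the plan below, so the exact text may have differed):

SELECT a.date_observed, a.base_value
FROM device_read_data a
JOIN equipment_set_rw b
  ON a.fk_device_rw_id = b.fk_device_rw_id
WHERE b.fk_equipment_set_id = CAST('ed151028-1fc0-11e3-b79f-47c0fd87d2b4' AS uuid)
AND a.date_observed
BETWEEN '2013-12-01 07:45:00+00'::timestamptz
AND '2014-01-01 07:59:59+00'::timestamptz
AND a.base_value ~ '[0-9]+(\.[0-9]+)?'
;

Here's the query plan: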
"Hash Join (cost=11.65..6013386.19 rows=90835 width=16) (actual time=35.272..127965.785 rows=43609 loops=1)"
" Hash Cond: (a.fk_device_rw_id = b.fk_device_rw_id)"
" -> Seq Scan on device_read_data a (cost=0.00..5565898.74 rows=71450793 width=32) (actual time=13.050..119667.814 rows=71764376 loops=1)"
" Filter: ((date_observed >= '2013-12-01 07:45:00+00'::timestamp with time zone) AND (date_observed <= '2014-01-01 07:59:59+00'::timestamp with time zone) AND ((base_value)::text ~ '[0-9]+(\.[0-9]+)?'::text))"
" Rows Removed by Filter: 85426425"
" -> Hash (cost=11.61..11.61 rows=3 width=16) (actual time=0.018..0.018 rows=1 loops=1)"
" Buckets: 1024 Batches: 1 Memory Usage: 1kB"
" -> Bitmap Heap Scan on equipment_set_rw b (cost=4.27..11.61 rows=3 width=16) (actual time=0.015..0.016 rows=1 loops=1)"
" Recheck Cond: (fk_equipment_set_id = 'ed151028-1fc0-11e3-b79f-47c0fd87d2b4'::uuid)"
" -> Bitmap Index Scan on uc_fk_equipment_set_id_fk_device_rw_id (cost=0.00..4.27 rows=3 width=0) (actual time=0.011..0.011 rows=1 loops=1)"
" Index Cond: (fk_equipment_set_id = 'ed151028-1fc0-11e3-b79f-47c0fd87d2b4'::uuid)"
"Total runtime: 127992.849 ms"
It seems like a simple problem: return records from a table that fall in a particular date range. Given my existing system architecture, perhaps there's a threshold for how many records the table can hold before performance is adversely affected.
Unless there are other suggestions, I may need to pursue the partitioning approach.
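If it comes to that, here's the rough shape I have in mind; a minimal sketch of inheritance-based partitioning (the only kind available in 9.2), with illustrative names and a single monthly child:

-- One child table per month; the CHECK constraint is what lets the
-- planner exclude children outside the query's date range.
CREATE TABLE device_read_data_2013_12 (
    CHECK (date_observed >= '2013-12-01 00:00:00+00'::timestamptz
       AND date_observed <  '2014-01-01 00:00:00+00'::timestamptz)
) INHERITS (device_read_data);

-- Children do not inherit indexes, so each needs its own.
CREATE INDEX idx_device_read_data_2013_12_date_obs_dev
    ON device_read_data_2013_12 (date_observed, fk_device_rw_id);

-- With constraint_exclusion = partition (the default), the planner skips
-- children whose CHECK constraints can't match the WHERE clause. Inserts
-- would need a trigger or application logic to route rows to the right child.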
Thanks for the help thus far!