I have lots of rows in my table in Postgres DB.
I make insertions in this table every 20 minutes, clear old entries every day and have only 2 selection options from it.
So I want to optimise time, that I wait for my selections.
First selection is kind of:
Select * from table where item=<item_id>
Second is kind of:
Select distinct(datetime) from table
So, in order to optimize 1 selection, I may make indexies for item
field. As I understood this technique works perfectly for queries where smth equals
smth.
But I don't know how to optimise my 2 selection query. I think smth like partitioning should help me, but there are several types of partitioning and I am a bit confused.
So what is the best option to optimize my queries?
Also, I am using python and Django models. If there is a good library that may do all the dirty job. That would be great. Most suitable now, that I found: http://architect.readthedocs.io/
Edit1 Thanks to Evan Carrol.
Tried to use index for second query. Command:
explain analyze select distinct time_updated from wow_auction_gordunni
Gives:
HashAggregate (cost=335091.65..335092.51 rows=86 width=8) (actual time=4246.582..4246.607 rows=91 loops=1)
Group Key: time_updated
-> Seq Scan on wow_auction_gordunni (cost=0.00..313574.92 rows=8606692 width=8) (actual time=0.047..2257.979 rows=8616562 loops=1)
Planning time: 0.080 ms
Execution time: 4246.675 ms
Then creating index and vacuum:
Create INDEX ON wow_auction_gordunni (time_updated);
VACUUM ANALYZE wow_auction_gordunni;
explain analyze select distinct time_updated from wow_auction_gordunni;
Gives following:
Unique (cost=0.43..249907.42 rows=92 width=8) (actual time=0.057..3537.626 rows=92 loops=1)
-> Index Only Scan using wow_auction_gordunni_time_updated_idx on wow_auction_gordunni (cost=0.43..228163.42 rows=8697599 width=8) (actual time=0.055..2488.408 rows=8696562 loops=1)
Heap Fetches: 85796
Planning time: 0.726 ms
Execution time: 3537.800 ms
So it seems that index helps a bit (postgres started to use index), but not dramatically.