Optimize PostgreSQL query with join, group by

Question

Thanks in advance to those who will help me. I have to optimize this PostgreSQL query, creating I suppose some indexes.

SELECT D.market, D.symbol, D.company_name, D.first_date, D.last_date, D.days
FROM (SELECT MIN(B.DIFF) AS MIN_DIFF 
      FROM (SELECT A.market AS market, A.symbol AS symbol, A.company_name AS company_name, MIN(A.date) AS first_date, MAX(A.date) AS last_date, MAX(A.date) - MIN(A.date) AS DIFF
            FROM (SELECT m.market AS market, s.company_name AS company_name, p.symbol AS symbol, p.date AS date 
                  FROM market m JOIN stock_symbols s ON (s.market = m.market) JOIN stock_price p ON (s.symbol = p.symbol)) AS A
                  GROUP BY A.market, A.symbol, A.company_name) AS B) AS C JOIN (SELECT C.market AS market, C.symbol AS symbol, C.company_name AS company_name, MIN(C.date) AS first_date, MAX(C.date) AS last_date, MAX(C.date) - MIN(C.date) AS days
                                                                                FROM (SELECT m.market AS market, s.company_name AS company_name, p.symbol AS symbol, p.date AS date 
                                                                                      FROM market m JOIN stock_symbols s ON (s.market = m.market) JOIN stock_price p ON (s.symbol = p.symbol)) AS C
                                                                                      GROUP BY C.market, C.symbol, C.company_name) AS D ON (D.days = C.MIN_DIFF)

I've tried to create an index on both the hash join conditions (s.market = m.market) and (s.symbol = p.symbol) but the DBMS doesn't accept them. When executed, I get the following actual execution plan cost:

"Hash Join  (cost=28626.01..31951.32 rows=359 width=66)"
"  Hash Cond: (((max(p.date) - min(p.date))) = (min(((max(p_1.date) - mi(p_1.date))))))"
"  ->  GroupAggregate  (cost=12697.02..15031.20 rows=71821 width=58)"
"        Group Key: m.market, p.symbol, s.company_name"
"        ->  Sort  (cost=12697.02..12876.57 rows=71821 width=58)"
"              Sort Key: m.market, p.symbol, s.company_name"
"              ->  Hash Join  (cost=150.12..4201.89 rows=71821 width=58)"
"                    Hash Cond: (p.symbol = s.symbol)"
"                    ->  Seq Scan on stock_price p  (cost=0.00..2588.04 rows=149104 width=8)"
"                    ->  Hash  (cost=121.85..121.85 rows=2262 width=54)"
"                          ->  Hash Join  (cost=1.02..121.85 rows=2262 width=54)"
"                                Hash Cond: ((s.market)::text = (m.market)::text)"
"                                ->  Seq Scan on stock_symbols s  (cost=0.00..81.24 rows=4524 width=32)"
"                                ->  Hash  (cost=1.01..1.01 rows=1 width=28)"
"                                      ->  Seq Scan on market m  (cost=0.00..1.01 rows=1 width=28)"
"  ->  Hash  (cost=15928.98..15928.98 rows=1 width=4)"
"        ->  Aggregate  (cost=15928.96..15928.97 rows=1 width=4)"
"              ->  GroupAggregate  (cost=12697.02..15031.20 rows=71821 width=58)"
"                    Group Key: m_1.market, p_1.symbol, s_1.company_name"
"                    ->  Sort  (cost=12697.02..12876.57 rows=71821 width=58)"
"                          Sort Key: m_1.market, p_1.symbol, s_1.company_name"
"                          ->  Hash Join  (cost=150.12..4201.89 rows=71821 width=58)"
"                                Hash Cond: (p_1.symbol = s_1.symbol)"
"                                ->  Seq Scan on stock_price p_1  (cost=0.00..2588.04 rows=149104 width=8)"
"                                ->  Hash  (cost=121.85..121.85 rows=2262 width=54)"
"                                      ->  Hash Join  (cost=1.02..121.85 rows=2262 width=54)"
"                                            Hash Cond: ((s_1.market)::text = (m_1.market)::text)"
"                                            ->  Seq Scan on stock_symbols s_1  (cost=0.00..81.24 rows=4524 width=32)"
"                                            ->  Hash  (cost=1.01..1.01 rows=1 width=28)"
"                                                  ->  Seq Scan on market m_1  (cost=0.00..1.01 rows=1 width=28)"

Could anyone help me? Thank you very much!!!!

`market` seems to only contain a single row - or at least that's what Postgres estimates. For a single row table, the database is never going to use an index lookup. `stock_symbols` is also expected to contain only 4500 rows so most probably the seq scan is more efficient than an index lookup as well. To get more details you should post the output of `explain (analyze, verbose)` to see which step is slow. — , Jan 08 '16 at 10:26

Optimize PostgreSQL query with join, group by

0 Answers0