Restructure SQL to enable query-planer to use result from subquery efficiently

Question

I have a query:

select * 
from tableA 
where colA > (select max(colA) from tableB)

This is terribly slow compared to:

select * 
from tableA 
where colA > 10

There are indices on tableA and tableB on colA.

If it matters, this problem occurs in Postgres, a db-independent solution would be nice.

query plan:

Seq Scan on "tableA" a  (cost=0.47..145224.62 rows=1256724 width=178)
  Filter: (colA > $1)
  InitPlan 2 (returns $1)
    ->  Result  (cost=0.46..0.47 rows=1 width=8)
          InitPlan 1 (returns $0)
            ->  Limit  (cost=0.43..0.46 rows=1 width=8)
                  ->  Index Only Scan using idx_colA_desc on "tableB" b  (cost=0.43..106076.56 rows=3770169 width=8)
                        Index Cond: (colA IS NOT NULL)

indeces:

CREATE INDEX ON tableA USING btree (colA DESC)
CREATE INDEX ON tableB USING btree (colA DESC)

Much more interesting is how does both tables look likey which indexes you have — nbk, Aug 31 '23 at 06:25
Please provide at least the query plan (explain analyse) and the indexes for performance related questions — Jim Jones, Aug 31 '23 at 06:29
The queries are completely different so you can't compare them. It would make more sense to compare the time needed to calculate the `max` value, store it in a variable and use that result in the second query. Is `SELECT MAX()` the slow query? The plan suggest that's the expensive part — Panagiotis Kanavos, Aug 31 '23 at 07:10
"select max(colA) from tableB" is instant because of the index. i cant to it in two queries, otherwise it would be trivial. the slow part is that the query planer does not use the index because it does not know the result of the subquery. — Vertago, Aug 31 '23 at 10:12

Restructure SQL to enable query-planer to use result from subquery efficiently

0 Answers0