
What is the proper way to implement a JOIN rewrite so that the outer query is fed the results of a subquery? For example:

SELECT state  FROM zips_view WHERE j IN (select j from people_view)

This query gets rewritten to a SemiJoin, which runs a table scan for the inner query (the select against _people_view_, as expected) and then a second table scan for the outer query (the select against _zips_view_). The second scan could be replaced with a filtered query, e.g.

SELECT state FROM zips_view WHERE j IN (1,2,3,4)

What's the proper way to implement a "two-phase" JOIN that takes the results of the subquery and adds them to the outer query as a filter condition?

  • Possible duplicate of [Join query in ElasticSearch](https://stackoverflow.com/questions/22611049/join-query-in-elasticsearch) – Mišo May 01 '19 at 10:03

1 Answer


I'm using the JDBC connector here, and for your desired query:

SELECT state FROM zips_iew WHERE j IN (1,2,3,4)

it generates this relational algebra:

LogicalProject(state=[$0])
  LogicalFilter(condition=[OR(=($0, 1), =($0, 2), =($0, 3), =($0, 4))])
    JdbcTableScan(table=[[zips_iew, state]])

You should start by writing rules that transform your original relational algebra (the one with the SemiJoin), and work your way down until you get relational algebra that looks like the plan above.
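To make the target of such a rewrite concrete, here is a minimal sketch of the two-phase idea in plain Java, outside Calcite. The class and method names (`TwoPhaseSemiJoin`, `collectKeys`, `buildOuterQuery`) are hypothetical, and an in-memory list stands in for the rows returned by the inner scan. It only illustrates what the second phase should end up asking for; inside Calcite the equivalent filter would be a `RexNode` condition in the plan, not a SQL string.

```java
import java.util.LinkedHashSet;
import java.util.List;
import java.util.Set;
import java.util.stream.Collectors;

// Hypothetical illustration only; not part of Calcite.
final class TwoPhaseSemiJoin {

    // Phase 1: stands in for executing the inner query.
    // Returns the distinct join keys in first-seen order.
    static Set<Integer> collectKeys(List<Integer> innerRows) {
        return new LinkedHashSet<>(innerRows);
    }

    // Phase 2: turns the collected keys into a filter on the outer query.
    static String buildOuterQuery(String table, String column,
                                  String keyColumn, Set<Integer> keys) {
        if (keys.isEmpty()) {
            // An empty IN list is not valid SQL, and a semi-join against
            // an empty inner side returns no rows anyway.
            return "SELECT " + column + " FROM " + table + " WHERE 1 = 0";
        }
        String inList = keys.stream()
                .map(String::valueOf)
                .collect(Collectors.joining(", "));
        return "SELECT " + column + " FROM " + table
                + " WHERE " + keyColumn + " IN (" + inList + ")";
    }

    public static void main(String[] args) {
        // Assumed sample values for people_view.j; duplicates are collapsed.
        Set<Integer> keys = collectKeys(List.of(1, 2, 2, 3, 4, 1));
        System.out.println(buildOuterQuery("zips_view", "state", "j", keys));
        // prints: SELECT state FROM zips_view WHERE j IN (1, 2, 3, 4)
    }
}
```

Whether this pays off depends on the inner result being small enough to inline as literals; for a large key set the plain SemiJoin is probably the better plan.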

igrgurina