I'm querying data from multiple data sources (for example CSV and MySQL) with a single SQL statement. How can I use Calcite to determine which data source each table belongs to, and which columns are queried on each table? (The metadata of the data sources is available.)

The result I need is something like:
- TableA(col1, col2, col3) -> Data source CSV
- TableB(col1, colx, coly) -> Data source MySQL

My case is similar to what Apache Drill (which uses Calcite) does. I tried reading the Drill source code, but I could not find how Drill resolves these relations.

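import org.apache.calcite.rel.RelRoot;
import org.apache.calcite.sql.SqlNode;
import org.apache.calcite.tools.FrameworkConfig;
import org.apache.calcite.tools.Frameworks;
import org.apache.calcite.tools.Planner;

String sql = "select c.c1, m.c2 from csv.tbl as c, mysql.schema.tbl as m where c.id = m.id";

Frameworks.ConfigBuilder configBuilder = Frameworks.newConfigBuilder();
configBuilder.defaultSchema(rootSchema); // rootSchema: placeholder for my SchemaPlus
FrameworkConfig frameworkConfig = configBuilder.build();
Planner planner = Frameworks.getPlanner(frameworkConfig);

// parse -> validate -> convert to a relational expression tree
SqlNode sqlNode = planner.parse(sql);
SqlNode validated = planner.validate(sqlNode);
RelRoot relRoot = planner.rel(validated);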

This is what I have so far, but I can't find the information I need in it ~_~|||

Thanks a lot.

iterrole

1 Answer

If your question is whether Calcite can automatically work out which columns you're using when that information isn't in your SQL query, it can't. It will assume you're using your default schema and try to resolve names there. If you're using multiple schemas, it's stupid (not in a bad way) and you have to tell it what to do: write your SQL query so that it contains that information, just like you did.

If you want to extract that information from the plan, you have to do it using a RelVisitor, like I did in my master's thesis. You can find the code here and the related issue here.
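To make the visitor idea concrete without pulling in a Calcite dependency, here is a minimal, self-contained sketch. `PlanNode`, `ScanNode` and `ScanCollector` are hypothetical stand-ins, not Calcite classes, and treating the first segment of the qualified name as the data source is an assumption based on the naming in the question. With Calcite itself you would instead subclass `RelVisitor`, override `visit(RelNode node, int ordinal, RelNode parent)`, check for `TableScan`, and read `getTable().getQualifiedName()`.

```java
import java.util.Arrays;
import java.util.LinkedHashMap;
import java.util.List;
import java.util.Map;

// Hypothetical stand-in for Calcite's RelNode: a plan node with child inputs.
class PlanNode {
    final List<PlanNode> inputs;

    PlanNode(PlanNode... inputs) {
        this.inputs = Arrays.asList(inputs);
    }
}

// Hypothetical stand-in for TableScan: a leaf carrying the table's qualified name.
class ScanNode extends PlanNode {
    final List<String> qualifiedName;

    ScanNode(String... qualifiedName) {
        super();
        this.qualifiedName = Arrays.asList(qualifiedName);
    }
}

// Walks the tree depth-first and records every scan it meets.
class ScanCollector {
    // Qualified table name -> data source (assumed: first name segment).
    final Map<String, String> sourceByTable = new LinkedHashMap<>();

    void visit(PlanNode node) {
        if (node instanceof ScanNode) {
            List<String> name = ((ScanNode) node).qualifiedName;
            sourceByTable.put(String.join(".", name), name.get(0));
        }
        for (PlanNode input : node.inputs) {
            visit(input);
        }
    }

    static Map<String, String> collect(PlanNode root) {
        ScanCollector collector = new ScanCollector();
        collector.visit(root);
        return collector.sourceByTable;
    }

    // Plan shaped like the query in the question:
    // Project(Join(Scan csv.tbl, Scan mysql.schema.tbl)).
    static PlanNode samplePlan() {
        PlanNode join = new PlanNode(
                new ScanNode("csv", "tbl"),
                new ScanNode("mysql", "schema", "tbl"));
        return new PlanNode(join);
    }

    public static void main(String[] args) {
        System.out.println(collect(samplePlan()));
        // prints {csv.tbl=csv, mysql.schema.tbl=mysql}
    }
}
```

The point is only that the data source does not need to be guessed: every leaf scan carries the full path you wrote in the SQL, so a single pass over the tree is enough to collect it.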

igrgurina
  • Thank you, it helps. Is it possible to get the qualified names of columns? For the SQL `select name as username, s.salary as salary from emp e, salaries s`, I get the field name `username` from the RelNode. If I had the qualified field `emp.name`, it would indicate that the column comes from the table `emp`. – iterrole May 28 '19 at 02:20
  • @iterrole Since every Project has to be associated with a TableScan of some sort, even when joining, you just have to find the right table(s) your Project is associated with. Look at line #113 in the code linked above. – igrgurina May 28 '19 at 14:13
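The tracing described in the last comment can be sketched as follows. This is a self-contained illustration, not Calcite code: `Rel`, `Scan`, `Join`, `Project` and `OriginDemo` are hypothetical stand-ins, the project is assumed to contain only plain column references, and the `id` and `emp_id` columns are made up to fill out the tables. In Calcite, `RelMetadataQuery.getColumnOrigins(rel, column)` may give you the same information directly.

```java
import java.util.Arrays;
import java.util.List;

// Hypothetical stand-ins for Calcite's RelNode hierarchy.
abstract class Rel {
    abstract int fieldCount();

    // Returns "table.column" for the given output ordinal.
    abstract String origin(int ordinal);
}

class Scan extends Rel {
    final String table;
    final List<String> columns;

    Scan(String table, String... columns) {
        this.table = table;
        this.columns = Arrays.asList(columns);
    }

    int fieldCount() { return columns.size(); }

    String origin(int ordinal) { return table + "." + columns.get(ordinal); }
}

class Join extends Rel {
    final Rel left;
    final Rel right;

    Join(Rel left, Rel right) {
        this.left = left;
        this.right = right;
    }

    int fieldCount() { return left.fieldCount() + right.fieldCount(); }

    // Join output is the left fields followed by the right fields.
    String origin(int ordinal) {
        int n = left.fieldCount();
        return ordinal < n ? left.origin(ordinal) : right.origin(ordinal - n);
    }
}

class Project extends Rel {
    final Rel input;
    final int[] refs;            // input ordinal behind each output field
    final List<String> aliases;  // output field names

    Project(Rel input, int[] refs, String... aliases) {
        this.input = input;
        this.refs = refs;
        this.aliases = Arrays.asList(aliases);
    }

    int fieldCount() { return refs.length; }

    String origin(int ordinal) { return input.origin(refs[ordinal]); }
}

class OriginDemo {
    // select name as username, s.salary as salary from emp e, salaries s
    static Project plan() {
        Rel emp = new Scan("emp", "id", "name");
        Rel salaries = new Scan("salaries", "emp_id", "salary");
        // Join fields: id(0), name(1), emp_id(2), salary(3)
        return new Project(new Join(emp, salaries), new int[] {1, 3}, "username", "salary");
    }

    public static void main(String[] args) {
        Project p = plan();
        for (int i = 0; i < p.fieldCount(); i++) {
            System.out.println(p.aliases.get(i) + " <- " + p.origin(i));
        }
        // prints:
        // username <- emp.name
        // salary <- salaries.salary
    }
}
```

The alias `username` is only the output name of the Project; the qualified name comes from following the field's input ordinal down to the scan that produces it.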