0

I need to implement the following and was hoping to get some ideas on how to solve the problem best:

I have several SQL databases and want to offer the option to use a somehow federated database, i.e. a virtual SQL layer is supposed to receive (read-only) queries. These queries should be executed for each of the single databases and merged to a total result and send back to the sender of the query.

So, the virtual SQL layers task would be to receive the initial query, call the single databases, and merge the result. Additionally for privacy reasons the virtual SQL layer needs to implement rules for removing specific results. Thus I need to be able to write my own code in the virtual layer.

Technology-wise I am open, however, Java would be preferred. So far, I am considering taking the H2 project as a base and make modifications there.

Mark Rotteveel
  • 100,966
  • 191
  • 140
  • 197
felix_w
  • 15
  • 7
  • Look for your RDBMS features, maybe this "virtual" call to many databases is already supported. If not, then just program it. You receive a query in a program, you send it via a library to all other databases. Asynchronously wait for results and merge them. Optionally serialize it and send a response. Not a tedious thing to program, yet has some security caveats – Alexey S. Larionov Aug 14 '20 at 15:07

1 Answers1

1

To do this using H2 constructs, you're going to want to implement a TableEngine. The features documentation says a little bit more about this. Hacking the actual H2 source code should be unnecessary.

For a project that does this already, see Apache Calcite: it parks a SQL parser and query optimizer on top of various notions of what are basically key-value stores.

Laird Nelson
  • 15,321
  • 19
  • 73
  • 127