SQL Dialect Translation Using PERL

Question

I have a rush job to take an SQL product that works on PostGre and Amazon's RedShift, and port it to work on Oracle. (Note that "work" does not currently imply "optimally" but rather "correctly".)

All of our TABLEs are defined using an ORM (DBI), but all of our VIEWs are currently stored as in-line SQL.

My hope is that there is a relatively simple way to run encapsulate the SQL in a kind of generic sense and use some existing tools to translate it in to Dialect Specific SQL.

Trivial Example:

SELECT CAST(x AS DECIMAL(16,4)) AS foo FROM bah
=>
SELECT CAST(x AS NUMBER(16,4)) AS foo FROM bah

That was simple. We have a "find and replace stage" when we deploy. The SQL is stored as CAST(x AS #DECIMAL#(16,4)) and then we replace #DECIMAL# with a new string depending on what dialect we are deploying to.

Frustrating Example:

SELECT x % y AS modulo FROM foo
=>
SELECT MOD(x, y) AS modulo FROM foo

And...

SELECT x / y AS int_div FROM foo
=>
SELECT trunc(x / y) AS int_div FROM foo

I'm no PERL expert, so I'm looking for pointers on how to do some kind of macro expansion.
- The SQL would be stored with some kind of macro expression in the string
- A "processor" would be called with parameters including "dialect" and "sql"
- The "dialect" parameter would dictate the output of the macro expansion

Dialectless-SQL: SELECT #DIV(x,y)# AS z FROM foo

Dialect: RedShift
Output : SELECT x / y AS z FROM foo

Dialect: Oracle11g
Output : SELECT floor(x / y) AS z FROM foo

It would need to cope with sub-queries, etc:

Dialectless-SQL: SELECT #MOD(x, (SELECT MAX(y) FROM bah))# AS z FROM foo

Dialect: RedShift
Output : SELECT x % (SELECT MAX(y) FROM bah) AS z FROM foo

Dialect: Oracle11g
Output : SELECT mod(x, (SELECT MAX(y) FROM bah)) AS z FROM foo

Or any reliable method for storing "generic" SQL, and being able to translate it to "dialect specific" SQL.

You have a challenge on your hands here. It sounds like you want to support multiple dialects simultaneously, rather than simply converting one to another? I think it's important that you start by writing a comprehensive set of tests, as your goal is a little imprecise at present. I also suggest that you restrict your baseline SQL dialect to functions rather than operators (i.e. `MOD(x, y)` rather than `x % y`) including basic operations like addition. That will make the job of parsing the input very much simpler. — Borodin, Jan 12 '15 at 13:44
You will no doubt find a parsing module like [`Parse::RecDescent`](https://metacpan.org/module/Parse::RecDescent) useful, as the task involves two steps: reducing the input to a parse tree and expanding that parse tree to an SQL dialect. You will have one front-end and two (or more?) back-ends. — Borodin, Jan 12 '15 at 13:47
@Borodin - We already have a test system in place that will be able to verify this; building the DDL from our ORM, deploying to a new database, processing test data sets through ETL, querying views for expected results. I do agree with the "dialect-less / macro" SQL having functions *(such as my last examples using `#DIV(x, y)#` and `#MOD(x, y)#`)*. Our front-end will need to "translate" its SQL in the same fashion. *(One "generic" front-end, one "generic" back-end. Many instances/installations, with the dialect being "applied" to the SQL at install time.)* — MatBailie, Jan 12 '15 at 14:34
Isn't this one of the main things ORMs like [`DBIx::Class`](https://metacpan.org/pod/DBIx::Class) do? They create an abstraction layer so you don't have to worry about writing vendor-specific queries. Of course, if you're pressed for time and your existing code base already uses raw SQL, switching to an ORM is probably not a viable option for you, but maybe something to think about in the future. — ThisSuitIsBlackNot, Jan 12 '15 at 15:26
@ThisSuitIsBlackNot - That's true for the tables, unfortunately the VIEWs are complex are defined with in-line SQL statements. — MatBailie, Jan 12 '15 at 16:17

score 3 · Answer 1 · answered Jan 12 '15 at 13:52

3

You might find the SQL::Translator module to be useful for this.

SQL::Translator is a group of Perl modules that converts vendor-specific SQL table definitions into other formats, such as other vendor-specific SQL, ER diagrams, documentation (POD and HTML), XML, and Class::DBI classes.

answered Jan 12 '15 at 13:52

Dave Cross

68,119
3
51
97

We already use DBI, so the DDL for tables is straight forward enough. It's the VIEWs and any in-line SQL embedded in the product that require translation at deploy-time. Thanks for the pointer to SQL::Translator, do you find it reliable in being able to parse human generated SQL, potentially including positional parameter tokens? – MatBailie Jan 12 '15 at 14:38
I've never used it on particularly complex SQL, so I can't really comment, I'm afraid. – Dave Cross Jan 12 '15 at 14:40
1

I would jump at a chance to use something called `SQLFairy`! – Borodin Jan 12 '15 at 15:13

SQL Dialect Translation Using PERL

1 Answers1