I'm using Apache Calcite to parse and validate arbitrary SQL. It works in most cases, but I run into trouble when a string literal contains non-ASCII (Unicode) characters.
For example:
    String sql = "SELECT '®'";
    SqlParser.Config config = SqlParser.configBuilder().setConfig(SqlParser.Config.DEFAULT)
            .setUnquotedCasing(Casing.UNCHANGED)
            .setQuoting(Quoting.BACK_TICK)
            .build();
    SqlParser parser = SqlParser.create(sql, config);
    SqlNode parsed;
    try {
        parsed = parser.parseQuery();
        parsed.toSqlString(MysqlSqlDialect.DEFAULT).getSql();
    } catch (Exception e) {
        // wheels fall off and catch fire
    }
This gives me
    SELECT u&'\00ae'
which my database won't accept.
Is there some way I can configure Calcite to return
    SELECT '®'
instead? I've had a look in the SqlDialect classes, and I think the issue is in this method:
    public void quoteStringLiteral(StringBuilder buf, @Nullable String charsetName,
            String val) {
        if (containsNonAscii(val) && charsetName == null) {
            quoteStringLiteralUnicode(buf, val);
        } else {
            if (charsetName != null) {
                buf.append("_");
                buf.append(charsetName);
            }
            buf.append(literalQuoteString);
            buf.append(val.replace(literalEndQuoteString, literalEscapedQuote));
            buf.append(literalEndQuoteString);
        }
    }
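As far as I can tell, this offers no configuration option to avoid the behaviour: any literal that contains non-ASCII characters and has no charset name is sent through quoteStringLiteralUnicode.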
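To check that I'm reading that branch correctly, here is a standalone sketch of what I believe it does. It does not use Calcite at all. The class and method names (LiteralQuotingSketch, quote, quoteUnicode) are my own, and the escaping rules (characters below 32 or at/above 128 written as a backslash plus four hex digits, single quotes doubled) are my assumption about the library's behaviour, not a copy of its code.

```java
public class LiteralQuotingSketch {
    private static final char[] HEX = "0123456789abcdef".toCharArray();

    // Assumption: "non-ASCII" means control characters or anything >= 128.
    public static boolean containsNonAscii(String s) {
        for (int i = 0; i < s.length(); i++) {
            char c = s.charAt(i);
            if (c < 32 || c >= 128) {
                return true;
            }
        }
        return false;
    }

    // Assumed shape of the Unicode form: u&'...' with \XXXX escapes.
    public static String quoteUnicode(String val) {
        StringBuilder buf = new StringBuilder("u&'");
        for (int i = 0; i < val.length(); i++) {
            char c = val.charAt(i);
            if (c < 32 || c >= 128) {
                buf.append('\\')
                   .append(HEX[(c >> 12) & 0xf])
                   .append(HEX[(c >> 8) & 0xf])
                   .append(HEX[(c >> 4) & 0xf])
                   .append(HEX[c & 0xf]);
            } else if (c == '\'' || c == '\\') {
                buf.append(c).append(c); // double quotes and backslashes
            } else {
                buf.append(c);
            }
        }
        return buf.append('\'').toString();
    }

    // Mirrors the if/else in the method quoted above, with ' as the quote
    // string and '' as the escaped quote (both assumed here).
    public static String quote(String val, String charsetName) {
        if (containsNonAscii(val) && charsetName == null) {
            return quoteUnicode(val);
        }
        StringBuilder buf = new StringBuilder();
        if (charsetName != null) {
            buf.append('_').append(charsetName);
        }
        return buf.append('\'').append(val.replace("'", "''")).append('\'').toString();
    }

    public static void main(String[] args) {
        System.out.println(quote("\u00ae", null));   // the registered sign, no charset
        System.out.println(quote("abc", null));      // plain ASCII
        System.out.println(quote("\u00ae", "utf8")); // charset name supplied
    }
}
```

In this sketch only the first call takes the escaping branch, which matches the output I'm seeing for '®'. The other two calls stay in the plain-quote branch.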