
Modifying a Hive query programmatically

I am parsing a Hive query with the ParseDriver.parse() method to get the parsed ASTNode tree. The use case is to add WHERE clauses to the query for row-level security.

Now that I have modified the parse tree, is there an existing method to convert it back into a Hive query string? I understand that modifying the parse tree can cause problems because it also stores indices into the original string. One option is to traverse the tree manually and construct the string.

1 Answer


My experience with modifying Hive queries:
I also wanted to modify Hive queries programmatically, for things like getting the tables accessed in a query, changing the WHERE clause to add conditions, and introducing additional joins. I first tried working with ANTLR 3 (which the Hive parser uses) and modifying the ASTs directly, then realized that what I wanted could be achieved much more easily with ANTLR 4. I set out to port the existing Hive grammar to ANTLR 4, only to find that a good Samaritan had already done it: https://github.com/apache/tajo/tree/branch-0.8.1/tajo-core/src/main/antlr4/org/apache/tajo/engine/parser. After that it was a matter of getting hold of the ANTLR 4 book to learn more, using the ANTLR Maven plugin to generate the sources from the pom, extending the generated listener, and using TokenStreamRewriter (the ANTLR 4 counterpart of ANTLR 3's TokenRewriteStream) to modify the queries. If you run into vanishing spaces in the output, you may need to modify the grammar slightly; see "ANTLR4: TokenStreamRewriter output doesn't have proper format (removes whitespaces)".
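
To make the rewriting idea concrete, here is a minimal, self-contained sketch of the principle I understand TokenStreamRewriter to rely on: the original text is never mutated, insertions are queued against positions in the original, and everything is rendered once at the end, so untouched whitespace survives. It deliberately does not use ANTLR. `OffsetRewriter`, `RowFilterDemo`, and `addFilter` are hypothetical names made up for this illustration, and the case-insensitive search for " WHERE " is only a naive stand-in for the positions a generated listener would report. It assumes the WHERE condition runs to the end of the query, which is not true in general (GROUP BY, ORDER BY, subqueries).

```java
import java.util.Map;
import java.util.TreeMap;

/** Hypothetical helper: queues insertions against offsets in the ORIGINAL text. */
class OffsetRewriter {
    private final String original;
    // offset -> text to emit immediately before the character at that offset
    private final TreeMap<Integer, StringBuilder> inserts = new TreeMap<>();

    OffsetRewriter(String original) {
        this.original = original;
    }

    /** Queue text to appear before the character at `offset` (0..length inclusive). */
    void insertBefore(int offset, String text) {
        if (offset < 0 || offset > original.length()) {
            throw new IndexOutOfBoundsException("offset " + offset);
        }
        inserts.computeIfAbsent(offset, k -> new StringBuilder()).append(text);
    }

    /** Copy the original once, splicing in queued text at each recorded offset. */
    String render() {
        StringBuilder out = new StringBuilder();
        int pos = 0;
        for (Map.Entry<Integer, StringBuilder> e : inserts.entrySet()) {
            out.append(original, pos, e.getKey()); // untouched text, whitespace included
            out.append(e.getValue());
            pos = e.getKey();
        }
        out.append(original.substring(pos));
        return out.toString();
    }
}

/** Hypothetical demo: append a row-level-security predicate to a query. */
class RowFilterDemo {
    // Case-insensitive search that keeps indices aligned with the original string.
    static int indexOfIgnoreCase(String s, String needle) {
        for (int i = 0; i + needle.length() <= s.length(); i++) {
            if (s.regionMatches(true, i, needle, 0, needle.length())) {
                return i;
            }
        }
        return -1;
    }

    static String addFilter(String query, String predicate) {
        // ASSUMPTION: stand-in for the span a parse-tree listener would supply.
        String keyword = " WHERE ";
        int whereAt = indexOfIgnoreCase(query, keyword);
        OffsetRewriter rw = new OffsetRewriter(query);
        if (whereAt < 0) {
            // No WHERE clause: append one at the end.
            rw.insertBefore(query.length(), " WHERE " + predicate);
        } else {
            // Parenthesize the existing condition so an OR inside it keeps its meaning.
            rw.insertBefore(whereAt + keyword.length(), "(");
            rw.insertBefore(query.length(), ") AND " + predicate);
        }
        return rw.render();
    }
}
```

Calling `RowFilterDemo.addFilter("SELECT id FROM emp WHERE dept = 'sales'", "region = 'EMEA'")` returns `SELECT id FROM emp WHERE (dept = 'sales') AND region = 'EMEA'`. With the real grammar, the two offsets would come from the start and stop tokens of the WHERE-clause context in the listener rather than from a string search.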

sourabh