0

Sample json document:

{
  "chats": [
    {
      "chatID": 123,
      "agentComments": "[{\"agentID\":\"agent1\", \"queueID\":\"queue1\", \"comment\":\"Visitor's query not relevant for this queue.\"}, {\"agentID\":\"agent2\", \"queueID\":\"queue2\", \"comment\":\"Resolved\"}]"
    }
  ]
}

Result required:

+---------+---------+-----------------------------------------------+
| chatID  | agentID |                    Comment                    |
+---------+---------+-----------------------------------------------+
| 123     | agent1  | visitor's query not relevant for this queue.  |
| 123     | agent2  | Resolved                                      |
+---------+---------+-----------------------------------------------+

Value for agentComments in the sample document given above is a stringified JSON. Had it been a JSONArray, above result could be obtained using something like

select B.chatID, B.agent.agentID agentID, B.agent.comment comment from (select A.chat.chatID chatID, flatten(A.chat.agentComments) agent from (select flatten(chats) chat from dfs.`/tmp/test.json`)A)B;

Please share a sample user defined function, in the context of apache drill, that works similar to JSON.parse available in Javascript.

Dev
  • 13,492
  • 19
  • 81
  • 174
Zrest
  • 1
  • 1
  • Have you seen drill [docs](https://drill.apache.org/docs/develop-custom-functions/)? – Dev Apr 16 '16 at 17:48

2 Answers2

2

You can write a Simple UDF, Parse the Input JSON and return String.

Here is the documentation on How to https://drill.apache.org/docs/develop-custom-functions-introduction/

Here is the sample code to take JSON as the input and return VarChar

@FunctionTemplate(names = { "jcontains" }, scope = FunctionTemplate.FunctionScope.SIMPLE, nulls = FunctionTemplate.NullHandling.NULL_IF_NULL, isRandom = true)
public class JsonContains implements DrillSimpleFunc {

@Param
FieldReader rowArray;
@Param
VarCharHolder containStr;


@Output
VarCharHolder output;

public void setup() {
}

public void eval() {
    try {
        String rowString = rowArray.readObject().toString();
        String conVal = org.apache.drill.exec.expr.fn.impl.StringFunctionHelpers.getStringFromVarCharHolder(containStr);

        output.value = //extract your string and set the value;
    } catch (Exception e) {
        e.printStackTrace();
        throw e;
    }
  }
}
vgunnu
  • 826
  • 8
  • 6
1

You can also solve this in Drill without a UDF by using the CONVERT_FROM function to parse the stringified JSON:

0: jdbc:drill:zk=local> select ttt.chatId as chatID, ttt.agentComment.agentId as agentID, ttt.agentComment.comment as Comment from (select tt.chatId, flatten(tt.agentComments) as agentComment from (select t.chat.chatID as chatID, convert_from(t.chat.agentComments, 'JSON') as agentComments from ( select flatten(chats) as chat from `chats.json`) t) tt) ttt;
+---------+----------+-----------------------------------------------+
| chatID  | agentID  |                    Comment                    |
+---------+----------+-----------------------------------------------+
| 123     | agent1   | Visitor's query not relevant for this queue.  |
| 123     | agent2   | Resolved                                      |
+---------+----------+-----------------------------------------------+
2 rows selected (0.188 seconds)

Here's the SQL nicely formatted and hopefully easier to read:

select
   ttt.chatId as chatID,
   ttt.agentComment.agentId as agentID,
   ttt.agentComment.comment as Comment 
from
   (
      select
         tt.chatId,
         flatten(tt.agentComments) as agentComment 
      from
         (
            select
               t.chat.chatID as chatID,
               convert_from(t.chat.agentComments, 'JSON') as agentComments 
            from
               (
                  select
                     flatten(chats) as chat 
                  from
                     `chats.json`
               )
               t
         )
         tt
   )
   ttt;
Vince Gonzalez
  • 380
  • 1
  • 9