Hi, I just started using Pig and was wondering whether JsonLoader can parse every field name and value inside a JSON record.

For example:

{"food":"Tacos", "person":"Alice", "amount":3}

I need to get the field name "food" stored in one relation as a chararray, and "Tacos", the value of "food", stored in another relation.

After reading many tutorials and the documentation, I haven't found a built-in way to do this.

Does that mean the only solution is a UDF?

Thanks a lot!

Zelldon
kenlz
    @Zelldon UDF = User Defined Function. If you don't know about Apache Pig, please don't post rude comments. – Balduz Sep 03 '15 at 15:29

2 Answers

1

I found the answer: use the external Elephant Bird JARs from Twitter (plus json-simple).

register 'hdfs:/udf/elephant-bird-pig-4.10.jar';
register 'hdfs:/udf/elephant-bird-core-4.10.jar';
register 'hdfs:/udf/elephant-bird-hadoop-compat-4.10.jar';
register 'hdfs:/udf/json-simple-1.1.1.jar';

test.json:

{"food":"Tacos", "person":"Alice", "amount":3}

script:

A = LOAD 'hdfs:/test.json' USING com.twitter.elephantbird.pig.load.JsonLoader('-nestedLoad') AS (json:map[]); 
DUMP A;

This gives the output I wanted:

([amount#3,food#Tacos,person#Alice])
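Once each record is loaded as a map, the field names and the values can be pulled into separate relations. The following is only a minimal sketch: it assumes a Pig release that ships the built-in map functions KEYSET and VALUELIST, and that A is the relation loaded above. The aliases names, vals and food are illustrative, not part of the original script.

```pig
-- Assumes A has the schema (json:map[]) from the LOAD statement above.

-- One row per field name found in each record.
names = FOREACH A GENERATE FLATTEN(KEYSET(json)) AS name:chararray;

-- One row per field value; no type is declared, so values stay untyped.
vals = FOREACH A GENERATE FLATTEN(VALUELIST(json)) AS value;

-- Look up a single value by its key with the # operator and cast it.
food = FOREACH A GENERATE (chararray) json#'food' AS food;

DUMP names;
DUMP vals;
DUMP food;
```

As far as I know, KEYSET and VALUELIST make no promise that names and values come out in matching order, so pairing them by position may still call for a UDF; looking values up by key with # avoids that problem when the key is known.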

Thanks!

kenlz
0

Input (pig.json):

{"food":"Tacos", "person":"Alice", "amount":3}

Script:

A = LOAD '/home/kishore/Data/Pig/pig.json' USING JsonLoader('food:chararray,person:chararray,amount:int');
B = FOREACH A GENERATE food, person, amount;
DUMP B;

Output:

(Tacos,Alice,3)
Kishore
  • Thanks for answering, but how do you get the field name, "food", into a relation? In some scenarios the field names may change, so I need both the name and the value in order to pair them in a map. – kenlz Sep 03 '15 at 10:52