I have a file that contains one JSON array per line.

Something like:

["6400000000",{"status":"FINE","ok":"false","addresses":"00:00:00:00:00:00"}]
["4900000000",{"status":"FINE","ok":"true","addresses":"00:00:00:00:00:00"}]

I'm running the following on Amazon EMR:

register 's3://mybucket/jar/elephant-bird-core-4.9.jar';
register 's3://mybucket/jar/elephant-bird-pig-4.9.jar';
register 's3://mybucket/jar/elephant-bird-hadoop-compat-4.9.jar';
register 's3://mybucket/jar/json-simple-1.1.jar';

sample = load 's3://mybucket/data/sample.json' using com.twitter.elephantbird.pig.load.JsonLoader('-nestedLoad') as (json:map[]);

dump sample;

I get the following error for every line of the JSON file:

java.lang.ClassCastException: org.json.simple.JSONArray cannot be cast to org.json.simple.JSONObject
    at com.twitter.elephantbird.pig.load.JsonLoader.parseStringToTuple(JsonLoader.java:158)
    at com.twitter.elephantbird.pig.load.JsonLoader.getNext(JsonLoader.java:129)
    at org.apache.pig.backend.hadoop.executionengine.mapReduceLayer.PigRecordReader.nextKeyValue(PigRecordReader.java:204)
    at org.apache.hadoop.mapred.MapTask$NewTrackingRecordReader.nextKeyValue(MapTask.java:562)
    at org.apache.hadoop.mapreduce.task.MapContextImpl.nextKeyValue(MapContextImpl.java:80)
    at org.apache.hadoop.mapreduce.lib.map.WrappedMapper$Context.nextKeyValue(WrappedMapper.java:91)
    at org.apache.hadoop.mapreduce.Mapper.run(Mapper.java:151)
    at org.apache.hadoop.mapred.MapTask.runNewMapper(MapTask.java:793)
    at org.apache.hadoop.mapred.MapTask.run(MapTask.java:342)
    at org.apache.hadoop.mapred.YarnChild$2.run(YarnChild.java:171)
    at java.security.AccessController.doPrivileged(Native Method)
    at javax.security.auth.Subject.doAs(Subject.java:415)
    at org.apache.hadoop.security.UserGroupInformation.doAs(UserGroupInformation.java:1628)
    at org.apache.hadoop.mapred.YarnChild.main(YarnChild.java:166)

Am I missing anything?
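
Reading the trace, the loader appears to cast each parsed line to a JSONObject, while every line in my file is a top-level JSON array. One workaround I am considering is to rewrite the file before loading it, so that each line becomes a JSON object. Below is a minimal sketch of that idea in standalone Python, run outside Pig. The function name `array_line_to_object_line` and the `id` key are hypothetical names I made up for illustration, and I am assuming every line has exactly the two-element shape shown above.

```python
import json


def array_line_to_object_line(line, id_key="id"):
    """Turn '["<id>", {...}]' into a single JSON object line.

    Assumes the line is a two-element array: a string followed by an object.
    The id_key name is an illustrative choice, not anything the loader requires.
    If the inner object already has a key equal to id_key, it overwrites the id.
    """
    record_id, fields = json.loads(line)
    merged = {id_key: record_id}
    merged.update(fields)
    return json.dumps(merged)


if __name__ == "__main__":
    sample_lines = [
        '["6400000000",{"status":"FINE","ok":"false","addresses":"00:00:00:00:00:00"}]',
        '["4900000000",{"status":"FINE","ok":"true","addresses":"00:00:00:00:00:00"}]',
    ]
    for raw in sample_lines:
        print(array_line_to_object_line(raw))
```

Each output line is then a flat JSON object, which is the shape the ClassCastException suggests the loader wants. I have not verified this against the loader itself.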
