I'm running a Pig script on EMR that reads data stored in Avro format. It worked locally, but to get other parts of the script working on EMR I had to downgrade the piggybank.jar I was using from 0.10.0 to 0.9.2. After that change, AvroStorage silently fails to read any data and returns zero records, and nothing is reported in the logs. Here's the script:
REGISTER ../../../lib/avro-1.7.0.jar
REGISTER ../../../lib/json-simple-1.1.1.jar
REGISTER ../../../lib/jackson-core-asl-1.5.2.jar
REGISTER ../../../lib/jackson-mapper-asl-1.5.2.jar
REGISTER ../../../lib/piggybank.jar
a = LOAD '/data/' USING org.apache.pig.piggybank.storage.avro.AvroStorage();
DUMP a;
To reiterate: with piggybank.jar 0.10.0 it works, and with 0.9.2 it does not. Should I be using a different version of any of the other libraries? I also tried avro-1.5.3.jar, and that did not work either.
Another note: if I run describe a; it correctly outputs the schema, so the schema is being read even though no records are returned.