I'm trying to use datafu.pig.stats.StreamingQuantile in LinkedIn's great DataFu library. However, I get the following error from Pig when it reaches the first StreamingQuantile usage:
2013-08-03 00:55:45,294 [main] ERROR org.apache.pig.tools.grunt.Grunt - ERROR 2998: Unhandled internal error. org/apache/pig/Accumulator
In the logfile, I see the following:
Pig Stack Trace
---------------
ERROR 2998: Unhandled internal error. org/apache/pig/Accumulator
java.lang.NoClassDefFoundError: org/apache/pig/Accumulator
at java.lang.ClassLoader.defineClass1(Native Method)
at java.lang.ClassLoader.defineClass(ClassLoader.java:792)
at java.security.SecureClassLoader.defineClass(SecureClassLoader.java:142)
at java.net.URLClassLoader.defineClass(URLClassLoader.java:449)
(snip)
I've tried using register to include guava.jar and also pig.jar. Neither one helps. As a belt-and-suspenders approach, I tried including several jars from the lib directory:
register file:/home/hadoop/lib/guava-13.0.1.jar
register file:/home/hadoop/lib/pig/pig-0.11.1.1-amzn.jar
register file:/home/hadoop/lib/pig/pig.jar
register s3://my-s3-location/datafu-0.0.10.jar
register file:/home/hadoop/lib/pig/piggybank.jar
This doesn't seem to be a common problem. NoClassDefFoundError itself is obviously common, but not for Accumulator, and especially not with DataFu. Here's the closest question on Stack Overflow, but it's related to HBase and I couldn't find anything in it that helped. The only answer to that question also suggests something I already tried, unfortunately.
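One check that might help narrow this down is to ask the JVM which jar, if any, supplies org.apache.pig.Accumulator. The following is only a minimal sketch: ClassProbe and locate are hypothetical names introduced here, and the result reflects only the classpath of the JVM it runs in, which may differ from the classpath Pig itself uses.

```java
import java.security.CodeSource;

// Hypothetical helper: reports where a class would be loaded from.
class ClassProbe {

    // Returns the jar/directory URL providing className, "bootstrap" for
    // classes with no code source (e.g. JDK core classes), or "NOT FOUND".
    static String locate(String className) {
        try {
            // initialize=false: resolve the class without running static initializers
            Class<?> c = Class.forName(className, false, ClassProbe.class.getClassLoader());
            CodeSource src = c.getProtectionDomain().getCodeSource();
            if (src == null || src.getLocation() == null) {
                return "bootstrap";
            }
            return src.getLocation().toString();
        } catch (ClassNotFoundException | LinkageError e) {
            // LinkageError covers NoClassDefFoundError raised while resolving a parent type
            return "NOT FOUND";
        }
    }

    public static void main(String[] args) {
        // Defaults to the two classes involved in the stack trace above
        String[] names = args.length > 0
                ? args
                : new String[] {"org.apache.pig.Accumulator", "datafu.pig.stats.StreamingQuantile"};
        for (String name : names) {
            System.out.println(name + " -> " + locate(name));
        }
    }
}
```

Running it with the same jars on the classpath as in the register statements (for example via java -cp with /home/hadoop/lib/pig/pig.jar and the DataFu jar) should show whether the Pig jar being picked up actually contains the Accumulator interface.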