
I had a specific filtering problem (described here: Pig - How to manipulate and compare dates?), so, as I was advised, I decided to write my own filtering UDF. Here is the code:

import java.io.IOException;

import org.apache.pig.FilterFunc;
import org.apache.pig.data.Tuple;

import org.joda.time.*;
import org.joda.time.format.*;

public class DateCloseEnough extends FilterFunc {


int nbmois;

/*
 * @param nbmois_: if the number of months between two dates is at most this value, the two dates are considered close
 */
public DateCloseEnough(String nbmois_) {
    nbmois = Integer.valueOf(nbmois_);
}

public Boolean exec(Tuple input) throws IOException {

    // We're getting the date
    String date1 = (String)input.get(0);

    // We convert it into date
    final DateTimeFormatter dtf = DateTimeFormat.forPattern("MM yyyy");
    LocalDate d1 = new LocalDate();
    d1 = LocalDate.parse(date1, dtf);
    d1 = d1.withDayOfMonth(1);

    // We're getting today's date
    DateTime today = new DateTime();
    int mois = today.getMonthOfYear();
    String real_mois;
    if(mois >= 1 && mois <= 9) real_mois = "0" + mois;
    else real_mois = "" + mois;

    LocalDate d2 = new LocalDate();
    d2 = LocalDate.parse(real_mois + " " + today.getYear(), dtf);
    d2 = d2.withDayOfMonth(1);

    // Number of months between these two dates
    String nb_months_between = "" + Months.monthsBetween(d1,d2);

    return (Integer.parseInt(nb_months_between) <= nbmois);

}



}

I created a Jar file of this code from Eclipse.

I'm filtering my data with these lines of Pig Latin code:

REGISTER Desktop/myUDFs.jar
DEFINE DateCloseEnough DateCloseEnough('12');

experiences1 = LOAD '/home/training/Desktop/BDD/experience.txt' USING PigStorage(',') AS (id_cv:int, id_experience:int, date_deb:chararray, date_fin:chararray, duree:int, contenu_experience:chararray);

experiences = FILTER experiences1 BY DateCloseEnough(date_fin);

I'm launching my program with this Linux command:

pig -x local "myScript.pig"

And I get this error:

2013-06-19 07:27:17,253 [main] INFO  org.apache.pig.Main - Logging error messages to: /home/training/pig_1371652037252.log
2013-06-19 07:27:17,933 [main] ERROR org.apache.pig.tools.grunt.Grunt - ERROR 2998: Unhandled internal error. org/joda/time/ReadablePartial Details at logfile: /home/training/pig_1371652037252.log

I checked into the log file and I saw this:

Pig Stack Trace

ERROR 2998: Unhandled internal error. org/joda/time/ReadablePartial

java.lang.NoClassDefFoundError: org/joda/time/ReadablePartial
at java.lang.Class.forName0(Native Method)
at java.lang.Class.forName(Class.java:247)
at org.apache.pig.impl.PigContext.resolveClassName(PigContext.java:441)
at org.apache.pig.impl.PigContext.instantiateFuncFromSpec(PigContext.java:471)
at org.apache.pig.impl.PigContext.instantiateFuncFromAlias(PigContext.java:544)
at org.apache.pig.impl.logicalLayer.parser.QueryParser.EvalFuncSpec(QueryParser.java:4834)
at org.apache.pig.impl.logicalLayer.parser.QueryParser.PUnaryCond(QueryParser.java:1949)
at org.apache.pig.impl.logicalLayer.parser.QueryParser.PAndCond(QueryParser.java:1790)
at org.apache.pig.impl.logicalLayer.parser.QueryParser.POrCond(QueryParser.java:1734)
at org.apache.pig.impl.logicalLayer.parser.QueryParser.PCond(QueryParser.java:1700)
at org.apache.pig.impl.logicalLayer.parser.QueryParser.FilterClause(QueryParser.java:1548)
at org.apache.pig.impl.logicalLayer.parser.QueryParser.BaseExpr(QueryParser.java:1276)
at org.apache.pig.impl.logicalLayer.parser.QueryParser.Expr(QueryParser.java:893)
at org.apache.pig.impl.logicalLayer.parser.QueryParser.Parse(QueryParser.java:682)
at org.apache.pig.impl.logicalLayer.LogicalPlanBuilder.parse(LogicalPlanBuilder.java:63)
at org.apache.pig.PigServer$Graph.parseQuery(PigServer.java:1031)
at org.apache.pig.PigServer$Graph.registerQuery(PigServer.java:981)
at org.apache.pig.PigServer.registerQuery(PigServer.java:383)
at org.apache.pig.tools.grunt.GruntParser.processPig(GruntParser.java:717)
at org.apache.pig.tools.pigscript.parser.PigScriptParser.parse(PigScriptParser.java:273)
at org.apache.pig.tools.grunt.GruntParser.parseStopOnError(GruntParser.java:166)
at org.apache.pig.tools.grunt.GruntParser.parseStopOnError(GruntParser.java:142)
at org.apache.pig.tools.grunt.Grunt.exec(Grunt.java:89)
at org.apache.pig.Main.main(Main.java:320)
Caused by: java.lang.ClassNotFoundException: org.joda.time.ReadablePartial
at java.net.URLClassLoader$1.run(URLClassLoader.java:200)
at java.security.AccessController.doPrivileged(Native Method)
at java.net.URLClassLoader.findClass(URLClassLoader.java:188)
at java.lang.ClassLoader.loadClass(ClassLoader.java:307)
at java.lang.ClassLoader.loadClass(ClassLoader.java:252)
at java.lang.ClassLoader.loadClassInternal(ClassLoader.java:320)
... 24 more

I tried to modify my PIG_CLASSPATH variable, but I found that this variable doesn't exist at all (some other Pig scripts work, though).

Do you have any idea how to solve the problem?

Thanks.

shanks_roux
  • See this answer: [how to include external jar file using PIG](http://stackoverflow.com/questions/10423990/how-to-include-external-jar-file-using-pig/16162785#16162785) – zsxwing Jun 20 '13 at 01:36
  • @zsxwing Thanks but it doesn't change anything. – shanks_roux Jun 20 '13 at 07:54
  • You added joda-time-2.2.jar by using `register path_to_joda-time-2.2.jar`? – zsxwing Jun 20 '13 at 08:02
  • @zsxwing no I included it in my java code – shanks_roux Jun 20 '13 at 08:04
  • How do you include it in your Java code? Did you unpack joda-time-2.2.jar and merge its contents into your jar, or did you just put the whole joda-time-2.2.jar inside your jar? The latter will not work, as Pig does not handle that case. – zsxwing Jun 20 '13 at 08:07
  • @zsxwing I include it via Eclipse (build path -> configure build path -> add external jars). I tried your solution though, and it seems to work, but now I get a new error, "ERROR 1002 Unable to store alias experiences", when I try to dump my "experiences" variable. I think it's because my UDF code is wrong. – shanks_roux Jun 20 '13 at 08:14
  • Adding it to the build path in Eclipse is not enough. Eclipse will not help you generate the correct jar. So what's your new error? Can you paste your error log? – zsxwing Jun 20 '13 at 08:19
  • @zsxwing ERROR 1002: Unable to store alias experiences org.apache.pig.impl.logicalLayer.FrontendException: ERROR 1066: Unable to open iterator for alias experiences at org.apache.pig.PigServer.openIterator(PigServer.java:479) at org.apache.pig.tools.grunt.GruntParser.processDump(GruntParser.java:536) at org.apache.pig.tools.pigscript.parser.PigScriptParser.parse(PigScriptParser.java:190) at org.apache.pig.tools.grunt.GruntParser.parseStopOnError(GruntParser.java:166) at org.apache.pig.tools.grunt.GruntParser.parseStopOnError(GruntParser.java:142) – shanks_roux Jun 20 '13 at 08:23
  • Do you have the full stack trace? Maybe there are some null values in your input, or a date parsing exception. – zsxwing Jun 20 '13 at 08:27
  • @zsxwing I can't post the full stack trace in a comment. I'll post a new answer – shanks_roux Jun 20 '13 at 08:31

2 Answers


First, you need to tell Pig which jars you are using. See this answer: how to include external jar file using PIG. Adding the jar to the build path in Eclipse is not enough; Eclipse will not help you generate the correct jar.

Second, `String nb_months_between = "" + Months.monthsBetween(d1,d2);` is wrong. Use `int nb_months_between = Months.monthsBetween(d1,d2).getMonths();` instead. If you read `Months.toString`, it returns `"P" + String.valueOf(getValue()) + "M"`, so the resulting string (for example `P3M`) cannot be converted to an int with `Integer.parseInt`.
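To illustrate the pitfall without needing Joda-Time on the classpath, here is a minimal sketch that uses the JDK's `java.time` as a stand-in (this is a swapped-in library, not the Joda-Time API from the question). `java.time.Period` also prints in ISO-8601 form, so concatenating it with `""` gives a string like `P3M` that `Integer.parseInt` rejects. The class name `MonthsBetweenDemo` and the helpers `monthsBetween` and `closeEnough` are hypothetical names chosen for this example.

```java
import java.time.Period;
import java.time.YearMonth;
import java.time.format.DateTimeFormatter;
import java.time.temporal.ChronoUnit;

public class MonthsBetweenDemo {

    // Same "MM yyyy" pattern as the UDF in the question.
    private static final DateTimeFormatter FMT = DateTimeFormatter.ofPattern("MM yyyy");

    // Hypothetical helper: whole months from one "MM yyyy" date to another, as a number.
    static long monthsBetween(String from, String to) {
        YearMonth start = YearMonth.parse(from, FMT);
        YearMonth end = YearMonth.parse(to, FMT);
        return ChronoUnit.MONTHS.between(start, end);
    }

    // Hypothetical helper mirroring the filter condition: true if at most maxMonths apart.
    static boolean closeEnough(String date, String reference, int maxMonths) {
        return monthsBetween(date, reference) <= maxMonths;
    }

    public static void main(String[] args) {
        // A period concatenated with "" is ISO-8601 text, not a plain number.
        String periodText = "" + Period.ofMonths(3);
        try {
            Integer.parseInt(periodText);
        } catch (NumberFormatException e) {
            System.out.println("Cannot parse as int: " + periodText);
        }

        // Asking for the numeric value directly avoids the string round trip.
        System.out.println(monthsBetween("06 2012", "06 2013"));
    }
}
```

With Joda-Time itself, the equivalent of asking for the numeric value is the `getMonths()` call shown above.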

zsxwing
  • Great, it works. One small clarification though: the command "pig -Dpig.additional.jars=/local/path/to/your.jar" doesn't work for me; you have to use "register /local/path/to/your.jar;" in your Pig script. Thanks. – shanks_roux Jun 20 '13 at 08:49
  • Sorry, `-Dpig.additional.jars` does not require an absolute path. Can you paste the command you use to run your script with `-Dpig.additional.jars`? For example, `pig -Dpig.additional.jars=abc.jar:efg.jar -f abc.pig` – zsxwing Jun 20 '13 at 08:54
  • I was only doing "pig -Dpig.additional.jars=abc.jar:efg.jar" without launching the script. That's why it didn't work – shanks_roux Jun 20 '13 at 08:58

You need the class org/joda/time/ReadablePartial.

You can find it on jarfinder: download joda-time-1.5.jar and add it to your project. This should resolve the error.